图在实际场景中的应用之wordladder-0.2


继续上次0.1的讨论。

在上个博客中,我提到,这个代码是错误的,原因就在于逆向探测部分存在的问题,那么,是什么问题呢??我们来分析下

首先,新的图:


对比0.1中的图,会发现,我们增加了H,I,J三个节点,并且,C有两个父亲(B和D),C又有两个孩子(F和H,这点与0.1不同,0.1中C只有F这个子节点),正是因为C多了H这个子节点,导致结果就完全不同。我们先按照旧的递归探测的代码进行说明。

   1. 第一次完整递归,有顺序G->J->I->H->C->D->B->A,A就是起点,至此,一次完整的探测结束,接下来开始回溯。

弹出A,到B;弹出B,B没有叔叔节点,递归结束,到D;弹出D,D没有叔叔节点,递归结束,到C;弹出C,C有叔叔节点B。

    2.此时,集合中的元素是G->J->I->H,压入C,这时候,新一轮的完整路径是G->J->I->H->C->B->A.(注意,这时候C作为节点,已被访问了2次).接下来开始回溯。

弹出A,到B;弹出B,B没有叔叔节点,递归结束,到C;弹出C,此时C再没有多余的叔叔节点,递归结束,到H;弹出H,H没有叔叔节点,递归结束,到I;弹出I,I没有叔叔节点,递归结束,到J;弹出J,J没有叔叔节点,到G;弹出G,G有叔叔节点E。

    3.此时,集合中的元素空。压入G,这时候,新一轮的完整路径是G->E->D->B->A。接下来开始回溯

弹出A,到B;弹出B,B没有叔叔节点,递归结束,到D;弹出D,D没有叔叔节点,递归结束,到E;弹出E,E没有叔叔节点,到G;G有叔叔节点F。

    4.此时,集合中的元素为空,压入G,这时候,新一轮的完整路径是G->F->E->D->B->A,下面开始回溯

弹出A,到B;弹出B,B没有叔叔节点,递归结束,到D;弹出D,D没有叔叔节点,递归结束,到F;弹出F,F有叔叔节点C,但是C已被访问了2次,根据代码设置,此时无法再进行访问,所以递归结束,到G;G再没有叔叔节点,递归结束。

    看到这里,我们会发现,我们漏了2条路径,分别是 G->F->C->B->A和G->F->C->D->B->A

    所以,关键就在于访问次数的限制,应该调整下。关键是在于叔叔节点的查找,不能单纯的根据访问次数来控制。父节点的访问次数,由子节点的数量决定,而子节点的访问次数,又由子子节点决定。

    并且,我们在刚开始的BFS遍历中,没有排除不同层次的交叉关系,比如有G->E,又有G->F->E,很显然,G可以直接连通E,而不需要再走F这个节点,所以我们修改下BFS的遍历,再调整下根据子节点的父节点找到路径的代码。

    首先,是BFS遍历的调整

新增了新的类成员, map<string, unsigned> m_loops;

    bool FindPathBFS(queue<string> & jobqueue,string end, vector<vector<string> > &final)
    {
        //老老实实使用队列实现BFS
        int times = 0;
        while(!jobqueue.empty())
        {
            queue<string> newqueue;
             ++times;
             
            while(!jobqueue.empty())
            {
                //将队列转移给B队列
          
                //不做递归,那么,路径如何传递
                string begin = jobqueue.front();
                jobqueue.pop();
       
                //从队列中取出元素,遍历他的adj,查询是否符合目标的元素
                list <string> * pedage = graph_addr[begin];
                list<string>::iterator viter;
                //使用m_pos,设置一个节点只能访问一次,会漏掉一些节点
                //如 a->b->c->e 和a->d->c->e,起点是a,终点是e,那么,因为b先访问了c,造成了d访问不了c,因此结果错误了
                //主要是为了防止无向图中,节点的逆向访问.
                    //使用广度优先
                    for(viter = pedage->begin(); viter != pedage->end(); ++viter)
                    { 
                        m_pos[begin][*viter] = 1;
                        
                        if (*viter == end)  //最终结果
                        {
                            //遍历查找,只记录上层节点是谁
                             m_multiparent[*viter].insert(begin);
            
                       }
                       else if ((m_loops.count(*viter) && times != m_loops[*viter]) || m_pos[*viter][begin]) //已访问,则不做处理
                       {
                            ;
                       }
                       else  //放入新队列
                       {
                           //在这里,会更改节点第一次被发现时的父节点,而该父节点是最短路径的父节点
          
                            //在这里,该成支持多个父亲节点
                            m_multiparent[*viter].insert(begin);
                             m_loops[*viter] = times;
                           //m_pos[*viter] = 1;
                           newqueue.push(*viter);
                       }
                 }
            }
            
            jobqueue = newqueue;
        }
        
        vector<string> test;
        m_pos.clear();
        m_nodeall[end] = 1;
        FindPathVector(end, test, final);
        
        return true;
    }

然后接下来是递归查找路径的方法调整。

新增了新的类成员, map<string, unsigned> m_nodeall;

    void FindPathVector(string end, vector<string> &test,vector<vector<string> > &result)
    {
        m_times[end] += 1;
      //  cout<<"Now, I am "<<end<<", and I am looking for my parent,  I have "<<m_multiparent[end].size()<<endl;

        int sun_num_a = graph_addr[end]->size() - m_multiparent[end].size();  //边的数目-作为子节点的边的条数

        int son_num = m_nodeall[end];
        
        if (!m_multiparent.count(end) || 0 == m_multiparent[end].size())
        {
            //本起始节点,则视为begin和end之间不可达
            if (test.size() == 0)
            {
                return ;
            }
            
            test.push_back(end);
            vector<string> vctfinal(test.size());
            //最终得到的结果是倒置的,所以在这里,我们要逆转一下,注意,不能直接操作test本身,因为递归中要用
             reverse_copy(test.begin(), test.end(), vctfinal.begin());
            
            if (result.size() > 0)
            {
               if (result[0].size() > vctfinal.size())
               {
                   result.clear(); //只保留最短的结果
                   result.push_back(vctfinal);
               }
               else if (result[0].size() == vctfinal.size())
               {
                   result.push_back(vctfinal);
                }
             }
             else
             {
                 result.push_back(vctfinal);
             }
             
            test.pop_back();
        }
        else if (m_multiparent[end].size() == 1 || (m_times[end] < m_multiparent[end].size() * son_num))  //长度等于1或长度大于1但是访问次数小于等于总个数
        {
            set<string>::iterator viter;

            for(viter = m_multiparent[end].begin(); viter != m_multiparent[end].end(); ++viter)
            {
                string val = *viter;
                test.push_back(end);
		CalcuteParent(end, val);
                FindPathVector(val, test, result);
                test.pop_back();
            }
        }
    }
    
    int CalcuteParent(string son, string parent)
    {
        if (m_pos.count(son) && m_pos[son].count(parent))
        {
            return 0;
        }
        
        cout<<"son's value is "<<son<<",and size is:"<<m_nodeall[son]<<flush;
        if (!m_nodeall.count(parent))
        {
            m_nodeall[parent] = m_nodeall[son];
        }
        else
        {
            m_nodeall[parent] += m_nodeall[son];
        }

        m_pos[son][parent] = 1;

        return 0;
    }
    

这份代码,可以得到正确的结果,但是在性能上,不管是时间复杂度还是空间复杂度,都耗费很大,导致提交leetcode时,Time Limit Exceeded,追踪了下时间的耗费,80%的时间用于构造图的邻接链表,20%的时间用于进行BFS遍历。这个算法决定了时间复杂度降低不下去,构建图的邻接表没有可以优化的空间.

    我们可以参看leetcode上,59ms解出这个题的代码中找到他们的思路。不过那个算法,不是用的图结构.

优质高效算法代码如下:

#include <iostream>
#include <vector>
#include <string>
#include <map>
#include <list>
#include <set>
#include <queue>
#include <unordered_set>
#include <exception>
#include <fstream>
#include <algorithm>
#include <time.h>

using namespace std;

class Node;

typedef vector<string> Ladder;
typedef unordered_set<string> StringSet;
typedef bool (*NodeCmper) (Node*, Node*);
typedef set<Node*, NodeCmper> NodeSet;


class Node
{
public:
    string word;
    vector<Node*> parents;

    Node(string w) : word(w) {}
    void addparent(Node* parent) { parents.push_back(parent); }

    // Yield all children of this node, and:
    //   1) If the child is found in $targetlayer, which means we found ladders that
    //      connect BEGIN-WORD and END-WORD, then we get all paths through this node
    //      to its ROOT node, and all paths through the target child node to its ROOT
    //      node, and combine the two group of paths to a group of ladders, and append
    //      these ladders to $ladders.
    //   2) Elif the $ladders is empty:
    //       2.1) If the child is found in $nextlayer, then get that child, and add
    //            this node to its parents.
    //       2.2) Else, add the child to nextlayer, and add this node to its parents.
    //   3) Else, do nothing.
    void yieldchildren(NodeSet& nextlayer, StringSet& wordlist, NodeSet& targetlayer,
                       vector<Ladder>& ladders, bool forward)
    {
        string nextword = word;
        for (int i = 0, n = nextword.length(); i < n; i++) {
            char oldchar = nextword[i];
            for (nextword[i] = 'a'; nextword[i] <= 'z'; nextword[i]++) 
            {
                if (wordlist.count(nextword)) //if have next word,that has one different character from me.
                {
                    // now we found a valid child-word, let's yield a child.
                    Node* child = new Node(nextword);
                    yield1(child, nextlayer, targetlayer, ladders, forward);
                }
            }
            nextword[i] = oldchar;
        }
    }


    // yield one child, see comment of function `yieldchildren`
    void yield1(Node* child, NodeSet& nextlayer, NodeSet& targetlayer,
                vector<Ladder>& ladders, bool forward) {
        auto itr = targetlayer.find(child);
        if (itr != targetlayer.end()) //we found the path between Begin and End, so let insert the path
        {
            for (Ladder path1 : this->getpaths())   //this is the destination's parent
            {
                for (Ladder path2 : (*itr)->getpaths()) //path2 normal is just the end word 
                {
                    if (forward) // positive_sign
                    {
                        ladders.push_back(path1);
                        ladders.back().insert(ladders.back().end(), path2.rbegin(), path2.rend());
                    } else //reverse result
                    {
                        ladders.push_back(path2);
                        ladders.back().insert(ladders.back().end(), path1.rbegin(), path1.rend());
                    }
                }
            }
        } else if (ladders.empty()) //not the end, so we put child into nextlayer
        {
            auto itr = nextlayer.find(child);
            if (itr != nextlayer.end()) //nextlayer already has this word
            {
                (*itr)->addparent(this);  //add this as one of child's parents 
            }
            else //nextlayer doesn't has child,so,insert this child
            {
                child->addparent(this);
                nextlayer.insert(child);
            }
        }
    }

    vector<Ladder> getpaths()
    {
        vector<Ladder> ladders;
        if (parents.empty()) {
            ladders.push_back(Ladder(1, word));
        } else {
            for (Node* parent : parents) {
                for (Ladder ladder : parent->getpaths()) {
                    ladders.push_back(ladder);
                    ladders.back().push_back(word);
                }
            }
        }
        return ladders;
    }
};

bool nodecmp(Node* pa, Node* pb)
{
    return pa->word < pb->word;
}

class Solution {
public:
    vector<Ladder> findLadders(string begin, string end, StringSet& wordlist) {
        vector<Ladder> ladders;
        Node headroot(begin), tailroot(end);
        NodeSet frontlayer(nodecmp), backlayer(nodecmp);
        NodeSet *ptr_layerA = &frontlayer, *ptr_layerB = &backlayer;
        bool forward = true;

        if (begin == end) {
            ladders.push_back(Ladder(1, begin));
            return ladders;
        }

        frontlayer.insert(&headroot);
        backlayer.insert(&tailroot);
        wordlist.insert(end);
        while (!ptr_layerA->empty() && !ptr_layerB->empty() && ladders.empty()) {
            NodeSet nextlayer(nodecmp);
            
            if (ptr_layerA->size() > ptr_layerB->size()) //why?
            {
                cout<<"before swap:A"<<endl;
                print(*ptr_layerA);
                cout<<"before swap:B"<<endl;
                print(*ptr_layerB);
                swap(ptr_layerA, ptr_layerB);
                cout<<"after swap:A:"<<endl;
                print(*ptr_layerA);
                cout<<"after swap:B:"<<endl;
                print(*ptr_layerB);
                forward = ! forward;
            }

            for (Node* node : *ptr_layerA) 
            {
                //important, cut the connection between ptr_layerA inner
                wordlist.erase(node->word);
            }

            for (Node* node : *ptr_layerA) {
                node->yieldchildren(nextlayer, wordlist, *ptr_layerB, ladders, forward);
            }
            swap(*ptr_layerA, nextlayer);
        }

        return ladders;
    }

    void print(NodeSet & node)
    {
        for(Node * pnode: node)
        {
            cout<<pnode->word<<"\t";
        }

        cout<<endl;
    }
};

int main()
{
        Solution solu;
        unordered_set<string> second({"si","go","se","cm","so","ph","mt","db","mb","sb","kr","ln","tm","le","av","sm","ar","ci","ca","br","ti","ba","to","ra","fa","yo","ow","sn","ya","cr","po","fe","ho","ma","re","or","rn","au","ur","rh","sr","tc","lt","lo","as","fr","nb","yb","if","pb","ge","th","pm","rb","sh","co","ga","li","ha","hz","no","bi","di","hi","qa","pi","os","uh","wm","an","me","mo","na","la","st","er","sc","ne","mn","mi","am","ex","pt","io","be","fm","ta","tb","ni","mr","pa","he","lr","sq","ye"});
    string begin = "qa";
    string end = "sq";

        vector<vector<string> > result =  solu.findLadders(begin, end, second);

        for(int i = 0; i < result.size(); ++i)
        {
            cout<<"第:"<<i+1<<"个结果是:"<<endl;

            for(int j = 0; j < result[i].size(); ++j)
            {
                cout<< result[i][j]<<"\t";
            }

            cout<<endl;
        }
    
    cout<<"over:"<<(clock()/1000)<<endl;
}

这个算法,很类似我的BFS遍历后的根据节点的父亲,查找整条完整路径的部分。不过他是从一开始,就以起点作为起始点,然后往下查找下一个可以到达的所有节点,并且在这时候就判断节点是否就是结束点,如果是,则打印整条路径,如果不是,则加入next集合中继续下一次的遍历。在遍历之前,先把集合从wordlist中删除,防止出现同一层的节点间,互相连接的情况,比如A->C ,B->C, A->B,当遍历A时,下一个是B和C。当遍历B和C时,从集合中删除B,C,防止B,C互联的情况。





评论

发表评论