Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

The model I was talking about is sometimes called a bidirected sequence graph. It's another member of a more general graph family de Bruijn graphs also belong to.

In that model, nodes have separate sets of left edges and right edges, and the edges connect node sides rather than nodes. For example, there can be an edge between the right sides of two nodes. The edges are undirected, but you can't exit a node from the side you entered it. Alternatively, the edges become directed once you fix the orientation of the node visit. Then the successors of a node in one orientation are its predecessors in the other orientation. Some graph representations have an underlying directed graph with separate nodes for the two orientations, but that's an implementation detail people usually don't want to think about.

The path-centric model can be thought as predicting the token preceding/following a context. Node D may have right edges to >E, <F, and >G, but if you are in context <B>C>D, only >E and <F are available. And if you extend the context to the left to >A<B>C>D, then your only option may be <F. This is a primitive operation that may be repeated a million times per CPU-second in a graph with hundreds of millions of nodes. You often don't know which extensions you are going to take until you have processed the sequences in the previous ones.

Matrices are a useful model when you want to do similar things to most nodes in a graph. But when you are exploring the graph locally in an iterative fashion, it's more convenient to think about nodes and edges.



Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: