
Let's say I have supernodes with many edges and would like to quickly return the top N edges for a given node. How can I do this with an ArangoDB vertex-centric index (https://docs.arangodb.com/3.11/index-and-search/indexing/working-with-indexes/vertex-centric-indexes/)?

I can create a skiplist vertex-centric index:

arangosh> db.collection.ensureIndex({ type: "skiplist", fields: [ "_from", "points" ] })

but the optimizer does not pick it up for a query that sorts on points:

FOR edge IN collection
  FILTER edge._from == "vertices/123456" 
  SORT edge.points DESC
  LIMIT 0, 10
  RETURN edge
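One way to check which index the optimizer actually chose is to inspect the execution plan. The sketch below is a small helper, not part of ArangoDB: the names indexesUsed and needsSort are hypothetical, and the plan shape (a nodes array whose IndexNode entries carry an indexes list, and whose TraversalNode entries carry indexes.base) is an assumption based on what explain() returns in arangosh.

```javascript
// Hypothetical helpers for reading an AQL execution plan.
// Assumption: `plan` has the shape of
//   db._createStatement({ query, bindVars }).explain().plan
// i.e. an object with a `nodes` array.

// List every index referenced by the plan.
function indexesUsed(plan) {
  const found = [];
  for (const node of plan.nodes || []) {
    if (node.type === "IndexNode") {
      // Plain FOR ... FILTER lookups report their indexes directly.
      for (const idx of node.indexes || []) {
        found.push({ node: node.type, type: idx.type, fields: idx.fields });
      }
    } else if (node.type === "TraversalNode" && node.indexes) {
      // Assumption: traversals report indexes under `indexes.base`.
      for (const idx of node.indexes.base || []) {
        found.push({ node: node.type, type: idx.type, fields: idx.fields });
      }
    }
  }
  return found;
}

// True if the plan still contains an explicit sort step, meaning the
// index order was not used to satisfy SORT.
function needsSort(plan) {
  return (plan.nodes || []).some((node) => node.type === "SortNode");
}

// Usage inside arangosh (not runnable outside it):
// const plan = db._createStatement({
//   query: 'FOR edge IN collection FILTER edge._from == "vertices/123456" ' +
//          'SORT edge.points DESC LIMIT 0, 10 RETURN edge'
// }).explain().plan;
// print(indexesUsed(plan), needsSort(plan));
```

If the (_from, points) index were used for both the filter and the sort, one would expect it to show up in the list and no SortNode to remain. db._explain(query) prints the same information in human-readable form.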

It also seems that the ArangoDB optimizer does not pick up the skiplist vertex-centric index with the traversal syntax, although the documentation says it should:

FOR v, e, p IN 3..5 OUTBOUND @start GRAPH @graphName
  FILTER p.edges[*].points ALL > 0
  RETURN v
  • Does it pick up the index if you change the traversal depth to 1..5 or 1..1?
    – CodeManX
    Commented Dec 18, 2017 at 9:21
  • Nope, it does not.
    – irriss
    Commented Dec 18, 2017 at 10:00
  • Please report this on GitHub: github.com/arangodb/arangodb/issues/new. Include the software version and if possible the dataset. It can be important to use the exact same data with a certain value distribution, because vertex centric indices are not always preferred over the default edge index based on the selectivity estimates.
    – CodeManX
    Commented Dec 18, 2017 at 10:51
  • Done: github.com/arangodb/arangodb/issues/4076
    – irriss
    Commented Dec 18, 2017 at 15:10

1 Answer


quickly return top N edges for a given node

It would be better to start with the node:

FOR v, e IN 1..1 ANY @start @edges
  SORT e.points DESC
  LIMIT 10
  RETURN e

This should be about as good as you can get with the current version (3.3) of ArangoDB, assuming you let ArangoDB index _from. I doubt that adding a skiplist index on points will make any beneficial difference, unless perhaps you use points in a FILTER.

(I believe that indexing _from with a skiplist would be unwise here. If edges is an edge collection, _from is already covered by the built-in edge index.)

  • Unfortunately, it does not work like that. Using the standard edge index, it can find all edges for a given node very quickly, but it then has to iterate over and sort all of them. In the case of a supernode this may take seconds.
    – irriss
    Commented Dec 18, 2017 at 14:17
  • @Ruslan - I have tried to clarify my answer. If you can add a FILTER (e.g. .points > 0) then of course that might help. Could you tell us how many edges there are at the node in question, how long your various queries take, and how long the query I've proposed takes?
    – peak
    Commented Dec 18, 2017 at 15:24
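The difference discussed in these comments can be illustrated with a toy model. This is a sketch in plain JavaScript, not ArangoDB's implementation: edges are plain objects, the "edge index" is modeled as a filter followed by a full sort, and the sorted (_from, points) index is modeled as an array kept in that order. All function names are hypothetical.

```javascript
// Edge-index style: find every edge of the node, then sort all of them.
// Cost grows with the number of edges k at the supernode (O(k log k)).
function topNByFullSort(edges, from, n) {
  return edges
    .filter((e) => e._from === from)
    .sort((a, b) => b.points - a.points)
    .slice(0, n);
}

// Toy stand-in for a sorted index on [_from, points] (ascending order).
function buildSortedIndex(edges) {
  return [...edges].sort((a, b) => {
    if (a._from < b._from) return -1;
    if (a._from > b._from) return 1;
    return a.points - b.points;
  });
}

// Sorted-index style: binary-search the end of the node's range, then
// walk backwards and stop after n entries (O(log m + n), m = index size).
function topNByIndex(index, from, n) {
  let lo = 0;
  let hi = index.length;
  while (lo < hi) {
    const mid = (lo + hi) >>> 1;
    if (index[mid]._from <= from) lo = mid + 1;
    else hi = mid;
  }
  const out = [];
  for (let i = lo - 1; i >= 0 && index[i]._from === from && out.length < n; i--) {
    out.push(index[i]);
  }
  return out;
}
```

Both functions return the same edges when points values are distinct, but the second never touches more than n entries of the supernode's range. Whether the optimizer actually uses a vertex-centric index this way for SORT ... LIMIT is exactly what the question asks.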
