Skip to content

gds-pagerank updates#185

Open
mohammad-ashraf-187 wants to merge 6 commits intotg_4.3.0_devfrom
ashraf-gds-updates
Open

gds-pagerank updates#185
mohammad-ashraf-187 wants to merge 6 commits intotg_4.3.0_devfrom
ashraf-gds-updates

Conversation

@mohammad-ashraf-187
Copy link
Copy Markdown

  1. Created baseline setup for pagerank algorithm.
  2. Enhanced README file for pagerank.
  3. Imported baseline code for betweeness centrality (pulled gds-corrections).
  4. Fixed some minor bugs and stability issues.


if __name__ == "__main__":
degree_cent_baseline.run()
fast_rp_baseline.run()
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

question: will we cover the pagerank_baseline here as well? I saw that it was imported but the pagerank_baseline.run() command weren't being called yet.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we will be covering all the baseline including pagerank...

python3 test/baseline/create_baselines.py &&
pytest test/test_centrality.py #test/test_ml.py
# python3 test/setup.py &&
python3 test/baseline/create_baselines.py #&&
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

question: is it the intention here to combine all testing runs into the create_baselines.py file?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

create_baseline.py file is for creating the baseline setup for all algorithm,currently we have only for degree cnetrality and fastrp.

to run test we have pytest test/test_centrality.py this will test the centrality algorithms based on the baseline created in create_baseline.py file. similarly we will be running pytest for test_classification,test_community,test_path_finding and etc.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it - in that case should we uncomment at least the pytest test/test_centrality.py so that the test would be able to be run?

python-dotenv==1.0.1
pyTigerGraph==1.5.2
pyTigerGraph[gds]==1.8.0
pyTigerGraph==1.5.2 # use updated version when available on PyPI inorder to fix any issues
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we use a newer version of pyTigerGraph here? Might want to coordinate this somewhat with @chengbiao-jin - I saw that as of 2 April, pyTigerGraph is currently at release 2.0.1

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes we can

kiwisolver==1.4.5
MarkupSafe==2.1.5
matplotlib==3.9.0
matplotlib==3.10.8
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for upgrading these to latest versions! 🙏

Maybe we should consider updating some other libraries too (unless some libraries require a specific version that's older)?

@@ -1,5 +1,5 @@
# Use an official Python runtime as a parent image
FROM python:3.9-slim
FROM python:3.11-slim
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: Let's perhaps use python 3.14 here since python 11 will reach end-of-life in 2027?

Per: https://devguide.python.org/versions/

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that is good idea, we need to stick to one python version so that it would help in development and also in production environmnet.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question - where is this result being obtained from? Is this result from running networkx betweenness centrality, implementation of TG betweenness centrality logic in python, or is it TG betweenness centrality in GSQL?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these are the baseline result for betweenness algorithm usually from the networkx not from TG, while running the pytest we will be comparing the result of tg result with this baseline result.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question - is this file created so that it'll pass the test as seen here?

with gzip.open(f"{baseline_path_root}/ml/fastRP.json.gz", "rb") as f:

If we aren't working on fastrp testing at the moment, I think it might be good to skip the fastrp test for now (maybe with some pytest skip and some comments.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think @abrahamchandy95 can give answer to this.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the README here for PageRank algorithm! 🙏

I did notice that some information here have overlaps with the documentation page:
https://docs.tigergraph.com/graph-ml/3.10/centrality-algorithms/pagerank

So, I was wondering what's the best option we should do in this case to avoid confusion:

Note: I'm fine with having readme file in here for now - but considering that the work might have to be done for all the queries here, I feel like we might want to have coordination with documentation team (or who is in charge of the documentation page around Graph Algorithms) as well around this to see what's the best place to put information about the algorithms into.

| Official PageRank Documentation | [docs.tigergraph.com](https://docs.tigergraph.com/graph-ml/current/centrality-algorithms/pagerank) |
| Algorithm Changelog | [CHANGELOG.md](https://github.com/tigergraph/gsql-graph-algorithms/blob/master/algorithms/Centrality/pagerank/CHANGELOG.md) |
| Full GDS Algorithm Library | [gsql-graph-algorithms](https://github.com/tigergraph/gsql-graph-algorithms)
| Community Forum | [community.tigergraph.com](https://community.tigergraph.com) |
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

while this URL still works (since it does the re-direct to the other one), the current dev forum, as far as I know, is at this website here: https://dev.tigergraph.com/forum/

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for providing updated link.

Comment on lines +101 to +111
### Option 1 — Via TigerGraph CLI

```bash
$ tg box algos install <Pagerank Algorithm>
$ tg box algos install <Algorithm>
```

#### Via GSQL terminal
**Example:**

```bash
$ tg box algos install tg_pagerank
```
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm the current tg box command doesn't work, so I was thinking that e should remove this option for installing the algorithms

CC @JonHerke-TG - what do you think? I think it might have worked for an older version of TG, but I never tried using tg box currently.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tg box command is available in Tigergraph Dev lab github repo.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm I just feel like the tg box one required some "more setup" than other methods and my concern was it wasn't that updated: https://github.com/TigerGraph-DevLabs/TigerGraph-CLI

It also wasn't that much simpler than the other ways of loading the algorithms too, which was why I'm fine with opting out of the tg box command.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants