add an autogenerated sitemap #14295
Conversation
Here is the sitemap generated by this PR: https://63867-843222-gh.circle-artifacts.com/0/doc/sitemap.xml
What are the implications of this? Does it mean Google will disregard its indexes of old versions? Of dev/?

Does the sitemap also inform Google about the main subsections of our website? Should we engineer that more carefully?
The sitemap is not going to prevent Google from indexing the older pages or the dev version, since there are links to them on the web anyway. However, it tells Google what we intend to have indexed, and it influences the algorithm [link]:
Also, the sphinx-sitemap plugin supports multiple versions, which means we could put all the versions in separate sitemap files and tell Google about all of them, but I think this is a good start.
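For reference, wiring up sphinx-sitemap is only a few lines of `conf.py`. A minimal sketch; the option names follow the sphinx-sitemap docs, but the base URL and URL scheme here are illustrative, not necessarily this PR's actual configuration:

```python
# conf.py -- minimal sphinx-sitemap setup (illustrative values)
extensions = [
    "sphinx_sitemap",  # plus whatever extensions the project already uses
]

# sphinx-sitemap builds each <loc> entry of sitemap.xml from this base URL
html_baseurl = "https://scikit-learn.org/stable/"

# The default scheme is "{lang}{version}{link}"; dropping {version} pins
# every entry to the single deployed version under html_baseurl.
sitemap_url_scheme = "{link}"
```

The `{version}` placeholder in `sitemap_url_scheme` is what would let separate per-version sitemaps be generated, as mentioned above.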
Not sure how to proceed from here.
We could add a sitemap, but I don't think it would directly help with the parent issue. https://webmasters.stackexchange.com/questions/99867/how-to-correctly-mark-up-different-versions-of-the-same-document-which-are-non-c hints that it's a non-trivial problem.
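One approach that comes up in that thread is canonical links. Since Sphinx 1.8, setting `html_baseurl` makes the HTML builder emit a `<link rel="canonical">` tag on every page; a sketch, assuming old-version docs were (re)built with the stable base URL, which is not what the project currently does:

```python
# conf.py -- sketch, assuming Sphinx >= 1.8; the URL is illustrative.
# With html_baseurl set, Sphinx adds
#   <link rel="canonical" href="https://scikit-learn.org/stable/<page>">
# to each generated page, telling search engines which copy to prefer
# even when they crawl an older version of the docs.
html_baseurl = "https://scikit-learn.org/stable/"
```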
Generally, scikit-learn pages are strongly interlinked, and search engines should be able to reconstruct the sitemap from links. I would rather rely on Sphinx producing a search-engine-friendly structure than do part of its work with a sitemap. As for the older scikit-learn versions, maybe a banner at the top indicating that a newer version exists? That's what RTD is doing (I think), or at least some other documentation sites that use Sphinx.
The issue is that the search engines find the pages too well and, as a result, list the old versions. Ideally the search engine would deprioritize the old versions, which is the purpose of this PR. I don't think this would make search engines stop indexing the old pages, but it may keep them from ranking too high in the results.
I also think we don't need to worry much about keeping the old versions visible to search engines, since we don't really actively support them anyway.
Maybe generate a |
I think that would be more drastic. We probably still want people to find things which are only available in the older versions, don't we? (I'm really not sure myself TBH)
Hmm, how many people google "DecisionTreeClassifier 0.18"? D: We can restrict the crawler to the last few versions?
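Restricting the crawler would amount to a `robots.txt` at the site root along these lines. A sketch only; the version list is illustrative, and blanket `Disallow` rules also stop crawlers from seeing any canonical hints on those pages:

```
User-agent: *
Disallow: /0.15/
Disallow: /0.16/
Disallow: /0.17/
Allow: /
```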
lol, sure, makes sense. Wanna submit a PR? I guess it would also supersede this one then.
@rth wrote:
This doesn't seem to be a priority; closing.
Fixes #13518.

Uses sphinx-sitemap to generate a sitemap from the stable version only.