[SVCS-234][SVCS-21][SVCS-273] Fix resolve_to_ids to better disambiguate paths, extensions etc.#210
Open
Johnetordoff wants to merge 2 commits intoCenterForOpenScience:developfrom
Open
Conversation
cslzchen
requested changes
Jun 27, 2018
Contributor
There was a problem hiding this comment.
This is a review for Campfire ⛺️ 🔥 .
This PR involves four tickets/PRs.
- SVCS-234 is the original ticket for this PR.
- SVCS-256 solves the root problem which will deprecate this PR.
- Both SVCS-21 and SVCS-273 have their own PR and has been merged.
I suggest we go with SVCS-256 the fix directly and deprecate this one.
- SVCS-256 fix the root bug, and I think it is the right fix.
- SVCS-256 fix is far simpler than this one.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Purpose
This arose out of the confluence of three tickets:
After trying to resolve each individually I decided they had to be fixed together within the same function. Since Gdrive doesn't really use a file tree structure, we have to do some detective work to disambiguate the path we are given. There are many edge cases and tricky situations, but this should solve the solvable ones. Here are the two basic problems with passing only a path to GDrive.
The extension problem:
Files which are unique to Gdrive (gdocs, gsheets, ext) can have the same name so for example a gdoc and gsheet can both be called "Test" with no extension. So in order to tell the difference between these files we require the user to add supply an extension (Test.gdoc for gdocs, Test.gsheet for gsheets, etc.) We use the mimeType in the metadata to pick out the correct file. There are exceptions to this rule, if the user doesn't have permission to download the file they can't use the extension.
The slash problem:
When we get the path, we divide it into parts by splitting the path by the '/' characters. Example: 1/2/3 is split into parts "1", "2", "3", but this creates a problem if a file contains a '/', Example 1/2/3 is actually "1", "2/3". We resolve this problem by assuming there are no folders slashes in them, so any time a folder.
Changes
Added utility functions for sorting json response items, reworked resolve_to_ids function to better parse path and find files by extension fixed tests and added comments.
Side Effects
Everything works aside from '/' containing folder names.
Tickets / PRs
Support slashes in google drive file names
Google Docs (and spreadsheets, presentations) are accessible w/ and w/o extension
PR: [SVCS-21] Add exception to prevent accessing gdoc files without file extension #195
Some read-only GDocs on GoogleDrive show up as 'Not Found'
Google Drive .g files of the same name do not display correctly
Found this one after I put in this PR should be closable if this goes through.