- Sponsor
-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
VFS URI reimplementation #2361
Comments
|
Replying to andrew_b (#2361):
Well, this is not good solution due to an extra token in path. That extra token makes the path parsing more complicated.
The better solutions are |
oioi, deep waters ahead ... :D
first of all, let's get the terminology clear. mc's vfs paths are *not* URIs in an RFC sense - and it's good that way, as it's optimized for local file names which are by far the most common use case.
i don't think we actually need the "vfs:" prefix - it just adds verbosity. i think /# followed by the vfs name (and a colon if parameters follow) is a sufficiently distinct magic. for the rare case where this would cause a false hit, a /#literal:<original> pseudo-vfs could be introduced.
to determine appropriate quoting/escaping rules, try constructing a path which represents a file inside a bzip-compressed tar inside an encrypted rar which resides in a subdirectory of an ftp-server which needs authentication. afterwards you try something really weird. :D |
Replying to ossi:
Yes, if you mean RFC2396. But it is Uniform Resource Identifier within MC VFS.
avfs created that incompatibility for years ago and I don't care about that now.
#2230? #a, #rpm can be a real file name. #rpms can be a real directoriy name. I want to decrease potential false hits and some verbosity like usage of #vfs: prefix will help us to do that. |
Replying to andrew_b:
the first problem is that the #a# is parsed as a vfs entry even though it clearly does not match the regexp /#(vfs1|vfs2|...)(:|/|$) (as implied by my first comment). |
fwiw, we should probably use a less verbose escape - simply doubling the hash would already work: /#a/ would become /##a/. whether the the escape should be applied to uris which don't strictly need it (because they don't match the above regexp, like the /#a# example) is a matter of taste, but _must_ be determined upfront - i think i'd enforce it for uniformity (and it makes the parser simpler, because it doesn't have to do look-ahead). |
Created 2361_vfs_uri branch. Parent branch is master. |
Create test environment:
Testing in mc: |
Replying to slavazanko:
Fixed. |
why did you completely disregard my comments about proper escaping and optimizing the common case? |
You can use sh://somehost. |
as far as i'm concerned, that doesn't exactly emphasize the U in URI.
also, the vfs: syntax is simply a gratuitous incompatible change of long-standing behavior. i've yet to see a compelling justification for it. |
ossi, can you help with writing new URI VFS parser as you like? |
you mean, like, more than reviewing? maybe when i find a long enough free timeslice while my brain is not toasted from work yet. ;)
reviewing the existing branch is quite a nightmare due to a lot of mixed in coding style cleanups ... you should take some inspiration from http://qt.gitorious.org/qt/pages/CommitPolicy point 8.
anyway, on the theory of the necessary changes:
if you want to start from the existing branch, first you need to get rid of the vfx prefix changes, leaving only the refactoring in. i'd probably throw away the commits and extract interesting parts from the cumulative diff with git gui.
you need a single structure to represent a parsed URI, as a list of segments - plain path segments and vfs root segments. then you need a function which parses URI strings (using callbacks to registered vfs handlers) and unescapes literal hashmarks, and a function which assembles the parts back to an URI (again using callbacks), including escaping hashmarks. a single entry point per direction ensures consistent handling.
i wonder what to do about the file#utar syntax. i'd personally get rid of it and require file/#utar (which, btw, kinda works until you try to leave the archive), but this cannot be done without a deprecation phase of the old syntax. but this is non-critical anyway - when hashmarks are properly escaped, a single hashmark could actually still be used as hierarchy delimiter, only that it looks weird (imagine #ftp:foo:b##r@host/yo/my#utar/thefile#urar:passwd/holla/die/wald##fee.gz#gz vs. /#ftp:foo:b##r@host/yo/my/#utar/thefile/#urar:passwd/holla/die/wald##fee.gz/#gz - i find the latter way more readable). |
|
Branch [2361_vfs_uri] have been totally rewritten.
New initial [a89c7d252dd34af26473459e5ccf8d0bd2870821]
Review, please. |
Note: VFS paths now handled as
vfsprefix1://vfsdata/vfsprefix2://vfsdata
Also, 'bindings' user file was renamed to 'mc.ext', so you need search in this file all
and replace them to
After this you should rename your 'bindings' file to 'mc.ext'
Old-style paths are handled too, but you couldn't mix URL-like and old style path elements in one path string.
Support of old-style paths will be removed in next major release (probably in 4.9, who knows...)
Be aware. |
i'll do a proper review in the next days.
why did you introduce the new syntax? a) it's noisier (see my previous arguments) b) it pretends to be something which it isn't ("proper" URLs) c) it is an apparently gratuitous change (i didn't see the commit message explain an actual need to do it) |
|
The new syntax more reliable to parse path string and easily for reading by humans :) |
care to back up that claim with some examples and counter-examples? |
|
Merge [f2ebbd2]
For getting list of commits in branch type:
|
|
$ cd
$ cd |
well, do we need to drop old-style path parser? |
Replying to slavazanko:
I think, yes. |
I think, need to leave old-style pathes just in one place: in "Directory hotlist". |
|
|
Support of relative symlinks was broken in new vfs. |
Created branch 2361_url_path
Initial [e6c25891e7b5cd8a0b8cae0d566e5c1c68a52f7a]
Review, please. |
|
|
|
Merge [906f688]
For getting list of commits type:
|
|
Important
This issue was migrated from Trac:
andrew_b
(@aborodin)zaytsev
(@zyv)ipv6
,URL
VFS used in mc has some leaks in design. For instance, the VFS URI contains a VFS name: testcase.tar#utar. In such case VFS name looks like a part of file name. Such desision a) limits a usage of # and @ symbols in filenames and passwords and b) is a root of many problems in URI parsing (see tickets in Blocking: field).
The new VFS URI is proposed: create a special path element /#vfs:vfsname/ that contains only vfs name with #vfs: prefix (like path encoding /#enc:encodingname/). This scheme simplifies the VFS URI parsing and removes the limitation of @ and # symbols usage in file names and passwords.
Examples:
/some/path/#vfs:patchfs/foo.diff
/#vfs:ftp/user:password@host/path/file
The text was updated successfully, but these errors were encountered: