What is the correct regular expression pattern to match a VMS filename?
11-10-2019
Question
The documentation at http://h71000.www7.hp.com/doc/731final/documentation/pdf/ovms_731_file_app.pdf (section 5-1) says that a filename should look like this:
node::device:[root.][directory-name]filename.type;version
Most of these parts are optional (such as node, device, and version). I am not sure which ones exactly, or how to express this correctly in a regexp (including the directory name):
DISK1:[MYROOT.][MYDIR]FILE.DAT
DISK1:[MYDIR]FILE.DAT
[MYDIR]FILE.DAT
FILE.DAT;10
NODE::DISK5:[REMOTE.ACCESS]FILE.DAT
Solution
See the documentation and source for the VMS::Filespec Perl module.
Other suggestions
From Wikipedia, the full form is actually a bit more than that:
NODE"accountname password"::device:[directory.subdirectory]filename.type;ver
This one took a while, but here is an expression that should accept all valid variations, and place the components into capture groups.
(?:(?:(?:([^\s:\[\]]+)(?:"([^\s"]+) ([^\s"]+)")?::)?([^\s:\[\]]+):)?\[([^\s:\[\]]+)\])?([^\s:\[\]\.]+)(\.[^\s:\[\];]+)?(;\d+)?
Also, from what I can tell, your example of
DISK1:[MYROOT.][MYDIR]FILE.DAT
is not a valid name. I believe only one pair of brackets is allowed. I hope this helps!
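To see which capture group receives which component, the expression above can be tried from Python's re module. This is only a sketch: the names VMS_RE, FIELDS and parse_vms are made up for illustration, and the field labels are an assumption based on the order of the capture groups. Note that the type and version groups keep their leading "." and ";".

```python
import re

# The expression from the answer above, unchanged, split over two lines.
VMS_RE = re.compile(
    r'(?:(?:(?:([^\s:\[\]]+)(?:"([^\s"]+) ([^\s"]+)")?::)?([^\s:\[\]]+):)?'
    r'\[([^\s:\[\]]+)\])?([^\s:\[\]\.]+)(\.[^\s:\[\];]+)?(;\d+)?'
)

# Assumed labels for capture groups 1..8 (hypothetical names).
FIELDS = ("node", "account", "password", "device",
          "directory", "name", "type", "version")


def parse_vms(spec):
    """Return the components that were present, or None if spec is rejected."""
    m = VMS_RE.fullmatch(spec)
    if m is None:
        return None
    return {k: v for k, v in zip(FIELDS, m.groups()) if v is not None}


print(parse_vms("FILE.DAT;10"))
# {'name': 'FILE', 'type': '.DAT', 'version': ';10'}
print(parse_vms("NODE::DISK5:[REMOTE.ACCESS]FILE.DAT"))
print(parse_vms("DISK1:[MYROOT.][MYDIR]FILE.DAT"))  # None: two bracket pairs
```

Using fullmatch rather than search means a name is only accepted when the whole string fits the pattern, which is why the two-bracket example is rejected instead of being partially matched.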
You could probably come up with a single complicated regex for this, but it will be much easier to read your code if you work your way from left to right stripping off each section if it is there. The following is some Python code that does just that:
import re

lines = ["DISK1:[MYROOT.][MYDIR]FILE.DAT", "DISK1:[MYDIR]FILE.DAT", "[MYDIR]FILE.DAT", "FILE.DAT;10", "NODE::DISK5:[REMOTE.ACCESS]FILE.DAT"]
node_re = r"(\w+)::"
device_re = r"(\w+):"
root_re = r"\[(\w+)\.\]"
dir_re = r"\[([\w.]+)\]"  # dots separate subdirectories, e.g. [REMOTE.ACCESS]
file_re = r"(\w+)\."
type_re = r"(\w+)"
version_re = r";(.*)"
re_dict = {"node": node_re, "device": device_re, "root": root_re, "directory": dir_re, "file": file_re, "type": type_re, "version": version_re}
order = ["node", "device", "root", "directory", "file", "type", "version"]
for line in lines:
    i = 0  # offset of the first character not yet consumed
    print(line)
    for item in order:
        # re.match anchors at the current offset, so a component is only
        # recognized when it is the next thing in the string
        m = re.match(re_dict[item], line[i:])
        if m is not None:
            print(" " + item + ": " + m.group(1))
            i += len(m.group(0))
and the output is
DISK1:[MYROOT.][MYDIR]FILE.DAT
 device: DISK1
 root: MYROOT
 directory: MYDIR
 file: FILE
 type: DAT
DISK1:[MYDIR]FILE.DAT
 device: DISK1
 directory: MYDIR
 file: FILE
 type: DAT
[MYDIR]FILE.DAT
 directory: MYDIR
 file: FILE
 type: DAT
FILE.DAT;10
 file: FILE
 type: DAT
 version: 10
NODE::DISK5:[REMOTE.ACCESS]FILE.DAT
 node: NODE
 device: DISK5
 directory: REMOTE.ACCESS
 file: FILE
 type: DAT
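If a single expression is still preferred, the same left-to-right order can be written as one verbose-mode pattern with named groups, which keeps it readable. This is a sketch, not a full VMS grammar: VMS_SPEC and split_vms are hypothetical names, and it assumes every component consists only of \w characters (plus dots inside the directory), as the code above does.

```python
import re

# One optional group per component, in the order they appear in a file spec.
VMS_SPEC = re.compile(r"""
    (?:(?P<node>\w+)::)?             # optional node
    (?:(?P<device>\w+):)?            # optional device
    (?:\[(?P<root>\w+)\.\])?         # optional rooted directory
    (?:\[(?P<directory>[\w.]+)\])?   # optional directory, dots allowed
    (?P<file>\w+)                    # file name
    (?:\.(?P<type>\w+))?             # optional type
    (?:;(?P<version>\d+))?           # optional numeric version
    """, re.VERBOSE)


def split_vms(spec):
    """Return a dict of the components present, or None if spec does not fit."""
    m = VMS_SPEC.fullmatch(spec)
    if m is None:
        return None
    return {k: v for k, v in m.groupdict().items() if v is not None}


for spec in ["DISK1:[MYROOT.][MYDIR]FILE.DAT", "FILE.DAT;10"]:
    print(spec, split_vms(spec))
```

Unlike the step-by-step loop, this rejects a string outright when something unexpected is left over, at the cost of a pattern that is harder to extend one component at a time.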