Using the fnmatch module in Python

The matching power of fnmatch() sits between simple string methods and full regular expressions. When a data-processing task needs nothing more than simple wildcards, it is usually a reasonable choice. The module's main purpose is matching file names, and its patterns follow Unix shell style. The source code is short:

"""Filename matching with shell patterns.

fnmatch(FILENAME, PATTERN) matches according to the local convention.
fnmatchcase(FILENAME, PATTERN) always takes case in account.

The functions operate by translating the pattern into a regular
expression.  They cache the compiled regular expressions for speed.

The function translate(PATTERN) returns a regular expression
corresponding to PATTERN.  (It does not compile it.)
"""
import os
import posixpath
import re
import functools

__all__ = ["filter", "fnmatch", "fnmatchcase", "translate"]

def fnmatch(name, pat):
    """Test whether FILENAME matches PATTERN.

    Patterns are Unix shell style:

    *       matches everything
    ?       matches any single character
    [seq]   matches any character in seq
    [!seq]  matches any char not in seq

    An initial period in FILENAME is not special.
    Both FILENAME and PATTERN are first case-normalized
    if the operating system requires it.
    If you don't want this, use fnmatchcase(FILENAME, PATTERN).
    """
    name = os.path.normcase(name)
    pat = os.path.normcase(pat)
    return fnmatchcase(name, pat)

@functools.lru_cache(maxsize=256, typed=True)
def _compile_pattern(pat):
    if isinstance(pat, bytes):
        pat_str = str(pat, 'ISO-8859-1')
        res_str = translate(pat_str)
        res = bytes(res_str, 'ISO-8859-1')
    else:
        res = translate(pat)
    return re.compile(res).match

def filter(names, pat):
    """Return the subset of the list NAMES that match PAT."""
    result = []
    pat = os.path.normcase(pat)
    match = _compile_pattern(pat)
    if os.path is posixpath:
        # normcase on posix is NOP. Optimize it away from the loop.
        for name in names:
            if match(name):
                result.append(name)
    else:
        for name in names:
            if match(os.path.normcase(name)):
                result.append(name)
    return result

def fnmatchcase(name, pat):
    """Test whether FILENAME matches PATTERN, including case.

    This is a version of fnmatch() which doesn't case-normalize
    its arguments.
    """
    match = _compile_pattern(pat)
    return match(name) is not None


def translate(pat):
    """Translate a shell PATTERN to a regular expression.

    There is no way to quote meta-characters.
    """

    i, n = 0, len(pat)
    res = ''
    while i < n:
        c = pat[i]
        i = i+1
        if c == '*':
            res = res + '.*'
        elif c == '?':
            res = res + '.'
        elif c == '[':
            j = i
            if j < n and pat[j] == '!':
                j = j+1
            if j < n and pat[j] == ']':
                j = j+1
            while j < n and pat[j] != ']':
                j = j+1
            if j >= n:
                res = res + '\\['
            else:
                stuff = pat[i:j].replace('\\','\\\\')
                i = j+1
                if stuff[0] == '!':
                    stuff = '^' + stuff[1:]
                elif stuff[0] == '^':
                    stuff = '\\' + stuff
                res = '%s[%s]' % (res, stuff)
        else:
            res = res + re.escape(c)
    return r'(?s:%s)\Z' % res
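
The four wildcard forms listed in the fnmatch() docstring can be tried out directly. The following is a minimal sketch rather than anything from the module itself: the file names and the `cases` list are made up for illustration, and fnmatchcase is used so that the results do not depend on the operating system.

```python
from fnmatch import fnmatchcase

# (name, pattern, expected) triples; all names are hypothetical examples.
cases = [
    ("report.txt", "*.txt", True),           # * matches any run of characters
    ("data1.csv", "data?.csv", True),        # ? matches exactly one character
    ("data12.csv", "data?.csv", False),      # two characters do not fit a single ?
    ("log_a.txt", "log_[abc].txt", True),    # [seq] matches one character from seq
    ("log_d.txt", "log_[!abc].txt", True),   # [!seq] matches one character not in seq
    (".bashrc", "*rc", True),                # a leading period is not special
]

results = [fnmatchcase(name, pattern) for name, pattern, _ in cases]
for (name, pattern, expected), got in zip(cases, results):
    print(f"{name!r} vs {pattern!r}: {got} (expected {expected})")
```

Note that the whole name has to match the pattern, which is why 'data12.csv' is rejected by 'data?.csv'.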

The fnmatch module exports four functions: ["filter", "fnmatch", "fnmatchcase", "translate"]

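  • filter: returns the matching names as a list
import os
import fnmatch

def gen_find(filepat, top):
    """
    Find every file under a directory tree whose name matches a shell-style wildcard pattern.
    :param filepat: shell-style wildcard pattern, e.g. '*.py'
    :param top: root directory to walk
    :return: generator of file paths (absolute only if top is absolute)
    """
    for path, _, filenames in os.walk(top):
        for file in fnmatch.filter(filenames, filepat):
            yield os.path.join(path, file)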
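
filter is not tied to os.walk; it accepts any iterable of names. A small sketch with made-up file names, which also shows that the result is the same as testing each name with fnmatch.fnmatch:

```python
import fnmatch

names = ["restart.py", "index.php", "file.txt", "setup.py"]  # hypothetical names

# filter keeps the names that match, in their original order.
py_files = fnmatch.filter(names, "*.py")
print(py_files)

# The same selection written as a list comprehension over fnmatch.fnmatch.
same = [n for n in names if fnmatch.fnmatch(n, "*.py")]
print(same == py_files)
```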
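  • fnmatch: tests a single name against a pattern, using the operating system's case rules
from fnmatch import fnmatch

# Pick out the Python files from a tuple of names
pyfiles = [py for py in ('restart.py', 'index.php', 'file.txt') if fnmatch(py, '*.py')]
# The string methods startswith() and endswith() are also handy for filtering the contents of a directory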
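  • fnmatchcase: case-sensitive matching
from fnmatch import fnmatchcase

# An often overlooked feature of fnmatch and fnmatchcase is that they are just as useful
# for strings that are not file names. Suppose you have a list of street addresses:
address = [
    '5412 N CLARK ST',
    '1060 W ADDISON ST',
    '1039 W GRANVILLE AVE',
    '2122 N CLARK ST',
    '4802 N BROADWAY',
]
print([addr for addr in address if fnmatchcase(addr, '* ST')])
# Output: ['5412 N CLARK ST', '1060 W ADDISON ST', '2122 N CLARK ST']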
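
The difference between fnmatch and fnmatchcase comes down to the os.path.normcase call visible in the module source. A hedged sketch with a made-up file name; the variable names are only for illustration:

```python
import os
from fnmatch import fnmatch, fnmatchcase

# fnmatchcase never normalizes case, so these results are the same everywhere.
strict_lower = fnmatchcase("README.TXT", "*.txt")
strict_upper = fnmatchcase("README.TXT", "*.TXT")
print(strict_lower, strict_upper)

# fnmatch applies os.path.normcase to both arguments first. On Windows that
# lowercases them, so the call below succeeds there; on POSIX systems
# normcase leaves the strings unchanged and the match fails.
local = fnmatch("README.TXT", "*.txt")
case_insensitive_os = os.path.normcase("A") == "a"
print(local, case_insensitive_os)
```

When the result must not vary between platforms, fnmatchcase is the safer choice.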
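  • translate: this one seems to be rarely used. As mentioned above, fnmatch patterns follow Unix shell style, and translate converts such a pattern into a regular expression. For example:
from fnmatch import translate

shell_match = 'Celery_?*.py'
print(translate(shell_match))
# Output with the source shown above: (?s:Celery_..*\.py)\Z
# The exact text may differ between Python versions.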

Celery_..*\.py is the regular-expression form of the shell pattern.
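
The string returned by translate is not compiled, so it has to go through re.compile before it can be used. The following sketch assumes a made-up list of file names and shows that the compiled regex accepts the same names the shell pattern would:

```python
import re
from fnmatch import translate

pattern = "Celery_?*.py"
regex = re.compile(translate(pattern))  # translate returns a plain string

# Hypothetical names chosen to exercise the pattern's edges.
names = ["Celery_app.py", "Celery_.py", "celery_app.py", "Celery_app.pyc"]
matched = [n for n in names if regex.match(n)]
print(matched)
```

Only 'Celery_app.py' passes: '?' requires at least one character after the underscore, the regex is case-sensitive, and the end-of-string anchor rejects the trailing 'c' in '.pyc'.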

Original source: segmentfault -> https://segmentfault.com/a/1190000017198824
