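Splitting into Words
Split an identifier into a list of words so that it can later be converted between naming conventions.
First import the regular expression module, re.
How to split is a matter of preference. One option is to split on common separators: space, '_', '-', '/', and '\'.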
    re.split(r'[ _\-/\\]+', name)
The net can also be cast wider by splitting on every character that is not a digit or a letter.
    re.split(r'[^0-9a-zA-Z]', name)
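To make the difference between the two patterns concrete, here is a small comparison. The sample name is made up for illustration, and raw strings are used so that the backslashes reach the regex engine unchanged.

```python
import re

# Hypothetical sample identifier, chosen only to show the difference.
name = 'user-name__id.txt'

# Pattern 1: only the listed separators; '+' collapses runs of separators.
by_separators = re.split(r'[ _\-/\\]+', name)
print(by_separators)   # ['user', 'name', 'id.txt']

# Pattern 2: any non-alphanumeric character; without '+', two adjacent
# separators produce an empty string that has to be filtered out later.
by_non_alnum = re.split(r'[^0-9a-zA-Z]', name)
print(by_non_alnum)    # ['user', 'name', '', 'id', 'txt']
```

Note that the second pattern leaves an empty string between the two underscores, so any code built on it needs to skip empty parts.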
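What about splitting inside a run of letters?
Taking camel case and runs of uppercase abbreviations into account, the strategy I usually use, based on experience, is to compare three consecutive characters (previous, current, next) and split before the current character when any of these patterns matches, where '|' marks the split point: "lower | Upper, end of string", "letter | Upper, lower", "lower | Upper, letter".
If the character is the last one, is uppercase, and the character before it is lowercase, split before it. For example, 'getA' is split into ['get', 'A'].
If the character is in the middle (neither first nor last), is uppercase, and at least one of its neighbors is lowercase while the other is a letter, split before it. For example, 'getJSONString' is split into ['get', 'JSON', 'String'].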
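The three-character rule can be sketched in isolation as follows. This is a minimal illustration for letter-only input; the helper names is_boundary and split_letters are hypothetical and are not part of the original code.

```python
def is_boundary(prev, cur, nxt):
    """Return True if a split should happen right before `cur`.

    `prev` and `nxt` are the neighboring characters; `nxt` is ''
    when `cur` is the last character.
    """
    if not cur.isupper():
        return False
    if nxt == '':
        # "lower | Upper, end of string"
        return prev.islower()
    # "lower | Upper, letter"  or  "letter | Upper, lower"
    return (prev.islower() and nxt.isalpha()) or (prev.isalpha() and nxt.islower())


def split_letters(part):
    """Split a separator-free string at the boundaries defined above."""
    words, word = [], ''
    for i, cur in enumerate(part):
        prev = part[i - 1] if i > 0 else ''
        nxt = part[i + 1] if i + 1 < len(part) else ''
        if i > 0 and is_boundary(prev, cur, nxt):
            words.append(word)
            word = ''
        word += cur
    if word:
        words.append(word)
    return words


print(split_letters('getA'))           # ['get', 'A']
print(split_letters('getJSONString'))  # ['get', 'JSON', 'String']
```

Keeping the boundary test in its own function makes it easy to adjust the rule without touching the loop.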
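Identifiers that mix letters and digits are harder to handle.
Some digits start a word (as in '3D'), some end one (as in 'HTML5'), and some alternate with letters (as in 'm3u8'). I have not yet found a good general way to split these, so implement whatever fits your own needs.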
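To see why no single rule works, consider one naive policy: split at every boundary between letters and digits. The helper below is a hypothetical sketch of that policy only, not a recommended solution.

```python
import re


def split_letters_and_digits(word):
    """Hypothetical policy: cut at every letter/digit boundary."""
    return re.findall(r'[A-Za-z]+|[0-9]+', word)


print(split_letters_and_digits('3D'))     # ['3', 'D']
print(split_letters_and_digits('HTML5'))  # ['HTML', '5']
print(split_letters_and_digits('m3u8'))   # ['m', '3', 'u', '8']
```

All three results break up something a reader would treat as one word, which is why the function in this article leaves digits attached to the surrounding letters.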
The splitting function that combines the points above is as follows.
    import re

    def to_words(name):
        words = []
        word = ''
        # Names of length 0 or 1 are returned as a single word.
        if len(name) <= 1:
            words.append(name)
            return words
        # First split on every character that is not a digit or a letter.
        name_parts = re.split(r'[^0-9a-zA-Z]', name)
        for part in name_parts:
            part_len = len(part)
            word = ''
            if not part:
                continue
            for index, char in enumerate(part):
                if index == part_len - 1:
                    # Last character: "lower | Upper" makes it a word of its own.
                    if char.isupper() and part[index - 1].islower():
                        if word:
                            words.append(word)
                        words.append(char)
                        word = ''
                        continue
                elif index != 0 and char.isupper():
                    # Middle character: split before an uppercase letter when one
                    # neighbor is lowercase and the other is a letter.
                    if ((part[index - 1].islower() and part[index + 1].isalpha())
                            or (part[index - 1].isalpha() and part[index + 1].islower())):
                        if word:
                            words.append(word)
                        word = ''
                word += char
            # Flush the word still being built for this part.
            if len(word) > 0:
                words.append(word)
        return [word for word in words if word != '']
Test cases:
    print(to_words('IDCard'))                  # ['ID', 'Card']
    print(to_words('getJSONObject'))           # ['get', 'JSON', 'Object']
    print(to_words('aaa@bbb.com'))             # ['aaa', 'bbb', 'com']
    print(to_words('D://documents/data.txt'))  # ['D', 'documents', 'data', 'txt']
Splitting into all-lowercase words
    def to_lower_words(name):
        words = to_words(name)
        return [word.lower() for word in words]
Splitting into all-uppercase words
    def to_upper_words(name):
        words = to_words(name)
        return [word.upper() for word in words]
Splitting into capitalized words (first letter uppercase, the rest lowercase)
    def to_capital_words(name):
        words = to_words(name)
        return [word.capitalize() for word in words]
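Converting to kebab case
Hyphen-separated naming is also called kebab case, e.g. 'kebab-case'.
    def to_kebab_case(name):
        words = to_lower_words(name)
        # Use a distinct variable name so the function name is not shadowed.
        kebab_case_name = '-'.join(words)
        return kebab_case_name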
Converting to lower snake case
Lower snake case is simply lowercase naming with underscores, usually just called snake case, e.g. 'snake_case'.
    def to_snake_case(name):
        words = to_lower_words(name)
        snake_case_name = '_'.join(words)
        return snake_case_name
Converting to upper snake case
Upper snake case is simply uppercase naming with underscores, also called macro case, e.g. 'MACRO_CASE'.
    def to_macro_case(name):
        words = to_upper_words(name)
        macro_case_name = '_'.join(words)
        return macro_case_name
Converting to lower camel case
Lower camel case is usually just called camel case, e.g. 'camelCase'.
The first word starts with a lowercase letter and every following word starts with an uppercase letter.
No separator is used.
    def to_camel_case(name):
        words = to_words(name)
        # Uppercase the first letter of every word and leave the rest untouched.
        camel_case_words = [word[:1].upper() + word[1:] for word in words]
        camel_case = ''.join(camel_case_words)
        # Lowercase only the very first character of the result,
        # so 'IDCard' becomes 'iDCard'.
        return camel_case[:1].lower() + camel_case[1:]
Converting to upper camel case
Upper camel case is also called Pascal case, e.g. 'PascalCase'.
    def to_pascal_case(name):
        words = to_words(name)
        # Uppercase the first letter of every word and leave the rest untouched.
        pascal_case_words = [word[:1].upper() + word[1:] for word in words]
        pascal_case = ''.join(pascal_case_words)
        return pascal_case
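Since every conversion is a combination of a case transform and a joiner, the five converters can also be collapsed into one function. The following is a minimal sketch, assuming the words have already been split; the function name format_words and the style keys are hypothetical.

```python
def format_words(words, style):
    """Join already-split words in the requested naming style."""
    # Uppercase each word's first letter and keep the rest as-is,
    # so abbreviations such as 'JSON' survive unchanged.
    capped = [w[:1].upper() + w[1:] for w in words]
    if style == 'kebab':
        return '-'.join(w.lower() for w in words)
    if style == 'snake':
        return '_'.join(w.lower() for w in words)
    if style == 'macro':
        return '_'.join(w.upper() for w in words)
    if style == 'pascal':
        return ''.join(capped)
    if style == 'camel':
        joined = ''.join(capped)
        # Lowercase only the very first character.
        return joined[:1].lower() + joined[1:]
    raise ValueError('unknown style: ' + style)


words = ['get', 'JSON', 'Object']
for style in ('kebab', 'snake', 'macro', 'camel', 'pascal'):
    print(style, format_words(words, style))
# kebab get-json-object
# snake get_json_object
# macro GET_JSON_OBJECT
# camel getJSONObject
# pascal GetJSONObject
```

Taking a word list rather than a raw name keeps the formatting step independent of whichever splitting strategy is chosen.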
That covers how to split identifiers into words and convert between naming conventions in Python.