BlueKing Ops SaaS Development in Practice, Open Course Lesson 2: Python Basics

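Python's characteristics

Dynamically and strongly typed
General-purpose
Interpreted
Elegant, explicit, simple

The essence of object orientation is abstracting code and encapsulating it.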

Basic data types

int

In Python, an integer is always of type int, no matter how large it is.

>>> type(111111111111111111111111)
<class 'int'>

float

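Keeping a specified precision

>>> a = "{:.2f}".format(3.1415)  # string formatting
>>> a
'3.14'
>>> round(3.1415926, 2)  # using the built-in function round
3.14

In Python 3, the / operator always returns a float. For integer division use //, which rounds down (toward negative infinity).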

>>> 15 / 3
5.0
>>> 15 // 3
5

>>> 20 / 3
6.666666666666667
>>> 20 // 3
6
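Rounding down only differs from dropping the fractional part when an operand is negative. The sketch below shows that case; the helper name `floor_div_demo` is made up for illustration.

```python
import math

def floor_div_demo(a, b):
    """Return (true division, floor division, truncation toward zero) for a and b."""
    return a / b, a // b, math.trunc(a / b)

print(floor_div_demo(7, 2))    # (3.5, 3, 3)
print(floor_div_demo(-7, 2))   # (-3.5, -4, -3)
print(divmod(-7, 2))           # (-4, 1), because -7 == 2 * (-4) + 1
```

If truncation toward zero is what you want, int(a / b) or math.trunc(a / b) is the usual choice for values small enough to be represented exactly as floats.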

Positive and negative infinity

>>> float("inf") > (2**64)
True
>>> float("-inf") < (-2**64)
True

string

String slicing

>>> s = 'iLovePython'
>>> s[-6:]
'Python'
>>> s[::-1]    # reversed
'nohtyPevoLi'
>>> s[::2]     # characters at even indices
'ioeyhn'
>>> s[1::2]    # characters at odd indices
'LvPto'

Common built-in string methods

# Match the start of the string
>>> s = 'iLovePython'
>>> s.startswith("iLove")
True

# Match the end of the string
>>> s.endswith("Python")
True

# Search for a character
>>> s = 'iLovePythony'
>>> s.find('y')  # find returns -1 when nothing is found
6
>>> s.index('y')  # index raises ValueError when nothing is found
6

# Count occurrences of a character
>>> s.count('y')
2

# Strip leading and trailing whitespace
>>> s = '  iLovePythony'
>>> s.strip()
'iLovePythony'

# Split on whitespace
>>> s = 'i  Love     Pythony'
>>> s.split()
['i', 'Love', 'Pythony']

# Join strings
>>> l = ['i', 'Love', 'Pythony']
>>> "-".join(l)
'i-Love-Pythony'
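The difference between find and index only shows up when the substring is missing: find returns -1, while index raises ValueError. A small sketch (the wrapper name `safe_index` is hypothetical):

```python
def safe_index(text, sub):
    """Return the position of sub in text, or None if it is absent."""
    try:
        return text.index(sub)   # raises ValueError when sub is missing
    except ValueError:
        return None

s = 'iLovePythony'
print(s.find('z'))          # -1
print(safe_index(s, 'z'))   # None
print(safe_index(s, 'y'))   # 6
```

Because -1 is also a valid index (the last character), checking the result of find before using it as an index avoids a subtle bug.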

String formatting

# method 1: using %s
>>> _str = "I'm from %s, %s is the capital of %s" % ("Xi'an", "Xi'an", "Shaanxi")
>>> _str
"I'm from Xi'an, Xi'an is the capital of Shaanxi"

# method 2: using {}
>>> _str = "I'm from {}, {} is the capital of {}".format("Xi'an", "Xi'an", "Shaanxi")
>>> _str
"I'm from Xi'an, Xi'an is the capital of Shaanxi"

# method 3: using {keyword}
>>> _str = "I'm from {city}, {city} is the capital of {province}".format(city="Xi'an", province="Shaanxi")
>>> _str
"I'm from Xi'an, Xi'an is the capital of Shaanxi"

list

Slicing works the same way as it does for strings.

>>> _list = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
>>> _list[::-1]
[10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
>>> _list[2:5]  # indices 2, 3, 4: includes 2, excludes 5
[2, 3, 4]
>>> _list[:]  # the whole list
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
>>> _list[-6:]  # from the sixth-from-last element to the end
[5, 6, 7, 8, 9, 10]

Single-element operations

>>> _list = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
>>> _list[0] = 100  # update one element
>>> _list
[100, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
>>> del _list[0]  # delete one element
>>> _list
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

Appending to a list

>>> _list = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
>>> _list
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
>>> _list.append(11)  # append() takes a single element, which need not be iterable
>>> _list
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
>>> _list.extend([12])  # extend() requires an iterable; equivalent to _list += [12]
>>> _list
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]

Counting

>>> _list = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1]
>>> _list.count(1)
2

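Popping

>>> _list = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1]
>>> _list.pop()  # pops the last element by default
1
>>> _list.pop(0)  # can also pop the first element
0
# A list is backed by an array. Popping the last element just drops the last reference, so it is fast.
# Popping the first element drops that reference and then shifts every remaining element forward by one, which takes linear time.

In detail, a Python list is a contiguous array of references to other objects. A pointer to this array and its length are stored in a list head structure. Taken naively, this would mean the array of references has to be reallocated every time an element is added or removed. Fortunately, CPython over-allocates when it grows the array, reserving spare capacity in proportion to the current size (the exact growth pattern is an implementation detail, not strict doubling), so most operations do not resize the array. This is why adding or removing an element at the end has a low amortized cost. For more on how list is implemented, see the article "The internal implementation of lists" (列表的内部实现).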
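Over-allocation can be observed indirectly. The sketch below assumes CPython, where sys.getsizeof reports the memory currently reserved by the list object; the function name `count_resizes` is made up, and the exact count varies between Python versions, so no number is claimed here.

```python
import sys

def count_resizes(n):
    """Append n items and count how often the list's reported size changes."""
    items = []
    last_size = sys.getsizeof(items)
    resizes = 0
    for i in range(n):
        items.append(i)
        size = sys.getsizeof(items)
        if size != last_size:    # the underlying array was reallocated
            resizes += 1
            last_size = size
    return resizes

resizes = count_resizes(1000)
print(resizes)   # far fewer than 1000 on CPython
```

If every append reallocated the array, the count would equal the number of appends; seeing far fewer is what makes the amortized cost low.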

List comprehensions

>>> a = ['apple', 'banana', 'orange']
>>> b = [list(a) for i in range(3)]  # list(a) builds a new list object each time: a shallow copy, independent of a
>>> b
[['apple', 'banana', 'orange'], ['apple', 'banana', 'orange'], ['apple', 'banana', 'orange']]
>>> a[0] = 'aaa'
>>> b
[['apple', 'banana', 'orange'], ['apple', 'banana', 'orange'], ['apple', 'banana', 'orange']]
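It may help to compare this with the version that does not copy, and to see where a shallow copy stops being independent. A minimal sketch (all variable names are illustrative):

```python
import copy

a = ['apple', 'banana', 'orange']
copies = [list(a) for _ in range(3)]    # three independent shallow copies
aliases = [a for _ in range(3)]         # three references to the same list

a[0] = 'aaa'
print(copies[0][0])      # apple
print(aliases[0][0])     # aaa
print(aliases[0] is a)   # True

# A shallow copy still shares nested mutable elements; deepcopy does not
nested = [[1, 2], [3, 4]]
shallow = list(nested)
deep = copy.deepcopy(nested)
nested[0].append(99)
print(shallow[0])        # [1, 2, 99]
print(deep[0])           # [1, 2]
```

For a flat list of strings, a shallow copy is enough because strings are immutable. Nested lists are where copy.deepcopy becomes necessary.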

dict

Ways to create a dict

>>> d1 = dict([('x', 1), ('y',  8)])
>>> d2 = {"x": 1, "y": 8}
>>> d1
{'x': 1, 'y': 8}
>>> d2
{'x': 1, 'y': 8}

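Can an instance be used as a key?

>>> class test:  # assumed minimal definition; the original does not show the class
...     def __init__(self, value):
...         self.value = value
...
>>> a = test(1)
>>> _d = {a: 1}
>>> _d[a]  # works: instances of user-defined classes are hashable by default, by identity
1
>>> a = test(2)  # a new object with a different identity, so it is a different key
>>> _d[a]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: <__main__.test object at 0x0000027886F52FA0>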
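By default an instance hashes by identity, so two instances built from the same arguments are still different keys. If lookups by value are wanted, a class can define __eq__ and __hash__ together. A minimal sketch with a hypothetical `Point` class:

```python
class Point:
    """Hypothetical value class: equal coordinates mean equal dict keys."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        return isinstance(other, Point) and (self.x, self.y) == (other.x, other.y)

    def __hash__(self):
        # Must agree with __eq__: equal objects need equal hashes
        return hash((self.x, self.y))

d = {Point(1, 2): "first"}
print(d[Point(1, 2)])     # first
print(Point(3, 4) in d)   # False
```

This only stays reliable if x and y are not changed after the object has been used as a key, since the hash would then no longer match the slot the entry was stored in.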

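A key cannot appear twice. If the same key is assigned twice when the dict is created, the later value is kept.

>>> _dict = {'a': 1, 'a': 2}
>>> _dict['a']
2

Keys must be hashable, which in practice means immutable: numbers, strings, and tuples all work. Note that a tuple containing a mutable object cannot be used as a key either.

>>> _dict = {(1, [1, 2, 3]): 3}
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'list'

Common dict built-in methods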

>>> d1 = dict([('x', 1), ('y',  8)])
>>> d1
{'x': 1, 'y': 8}

# get() accepts a default value
>>> d1.get('x', 0)
1
>>> d1.get('z', 0)
0

# keys()/values() /items()
>>> d1.keys()
dict_keys(['x', 'y'])
>>> d1.values()
dict_values([1, 8])
>>> d1.items()
dict_items([('x', 1), ('y', 8)])

# pop()
>>> d1.pop('x')
1
>>> d1
{'y': 8}

# update()
>>> d2 = {"a": 2, "b": 3}
>>> d2.update(d1)
>>> d2
{'a': 2, 'b': 3, 'y': 8}

#clear()
>>> d2.clear()
>>> d2
{}

set

Deduplicating with set

>>> _list = [1, 2, 3, 2, 3]
>>> _list = list(set(_list))
>>> _list
[1, 2, 3]

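Set intersection, union, difference, and symmetric difference

>>> s1 = set(range(3))
>>> s2 = set(range(2, 5))
>>> s1
{0, 1, 2}
>>> s2
{2, 3, 4}
>>> s1 & s2  # intersection
{2}
>>> s1 | s2  # union
{0, 1, 2, 3, 4}
>>> s1 - s2  # difference (relative complement): in s1 but not in s2
{0, 1}
>>> s1 ^ s2  # symmetric difference
{0, 1, 3, 4}
>>> (s1 | s2) - (s1 & s2)  # symmetric difference, spelled out
{0, 1, 3, 4}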
{0, 1, 3, 4}

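PS: relative and absolute complements
Relative complement: if A and B are sets, the relative complement of A in B is the set of elements that belong to B but not to A: B - A = { x | x ∈ B and x ∉ A }.
Absolute complement: given a universal set U with A ⊆ U, the relative complement of A in U is called the absolute complement of A (or simply the complement), written ∁_U A.

PPS: dict and set are both implemented as hash tables, so a set can only hold hashable (in practice, immutable) objects. In other words, anything that can be a dict key can be put in a set, and anything that cannot be a key cannot.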
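The rule about what can go in a set can be checked directly with the built-in hash(). A short sketch (the helper name `can_go_in_set` is made up for illustration):

```python
def can_go_in_set(obj):
    """Return True if obj is hashable, i.e. usable as a set element or dict key."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True

print(can_go_in_set((1, 2)))             # True
print(can_go_in_set([1, 2]))             # False
print(can_go_in_set((1, [2, 3])))        # False
print(can_go_in_set(frozenset({1})))     # True

# frozenset is the immutable counterpart of set, so a set of sets is possible
groups = {frozenset({1, 2}), frozenset({2, 1}), frozenset({3})}
print(len(groups))                       # 2
```

The two frozensets with the same elements collapse into one entry because they compare equal and hash equally.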

file

>>> f
<_io.TextIOWrapper name='C:\\Users\\Tommy\\Dropbox\\LeetCode\\md5.txt' mode='r' encoding='cp936'>
>>> f.name
'C:\\Users\\Tommy\\Dropbox\\LeetCode\\md5.txt'
>>> f.mode  # r(default)/w/a/x
'r'
>>> f.read()
'hello word!\nhello peopel!'
>>> f.read(100)  # after read(), a further read(100) returns nothing because the file position is already at the end of the file
''

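# Reopen the file
>>> f.close()
>>> f = open(r"C:\Users\Tommy\Dropbox\LeetCode\md5.txt", 'r')
>>> f.read(1)  # after reopening (f.seek(0) also works), reading starts from the beginning again
'h'
>>> f.read(1)  # in text mode, read(1) returns one character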
'e'
>>> f.read(1)
'l'
>>> f.read(1)
'l'
>>> f.read(1)
'o'
>>> f.read(1)
' '
>>> f.read(5)
'word!'

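# To read Chinese text, pass the file's encoding, e.g. encoding='utf-8'
>>> f = open(r"C:\Users\Tommy\Dropbox\LeetCode\md5.txt", 'r', encoding='utf-8')
>>> f.read(1)  # in text mode read(1) returns one character, even though a Chinese character takes several bytes in UTF-8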
'你'
>>> f.read(1)
'好'

# Read line by line with readline()
>>> f = open(r"C:\Users\Tommy\Dropbox\LeetCode\md5.txt", 'r')
>>> f.readline()
'hello world\n'
>>> f.readline()
'hello people'

# Read all lines at once with readlines()
>>> f = open(r"C:\Users\Tommy\Dropbox\LeetCode\md5.txt", 'r')
>>> f.readlines()
['hello world\n', 'hello people']

# Move the file position with seek(position)
>>> f = open(r"C:\Users\Tommy\Dropbox\LeetCode\md5.txt", 'r')
>>> f.seek(1)
1
>>> f.readline()
'ello world\n'

# Write to the file
>>> f = open(r"C:\Users\Tommy\Dropbox\LeetCode\md5.txt", 'a')
>>> f.mode
'a'
>>> f.write("\nHello Animal")
13
>>> f.flush()  # write buffered data out to disk
>>> f.close()  # close the file
>>> f = open(r"C:\Users\Tommy\Dropbox\LeetCode\md5.txt", 'r')
>>> f.readlines()
['hello world\n', 'hello people\n', 'Hello Animal']

The standard way to read and write: the with statement is recommended, because the file is closed automatically when the with block ends (closing also flushes any buffered data).

# Read
with open(r"C:\Users\Tommy\Dropbox\LeetCode\md5.txt") as f:
    for line in f:
        print(line, end="")  # each line already ends with "\n"

# output
hello world
hello people
Hello Animal

# Write
content = ['world hello\n', 'people hello\n', 'animal hello']
with open(r"C:\Users\Tommy\Dropbox\LeetCode\md5.txt", "w") as f:
    for line in content:
        f.write(line)

# Resulting file contents
world hello
people hello
animal hello
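The same write-then-read round trip can be tried without touching a real file. This sketch uses a temporary directory; the file name `demo.txt` is arbitrary.

```python
import os
import tempfile

lines = ['world hello\n', 'people hello\n', 'animal hello']

# A temporary directory keeps the sketch from touching real files
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "demo.txt")   # arbitrary file name

    with open(path, "w", encoding="utf-8") as f:
        f.writelines(lines)
    print(f.closed)   # True: the with block closed the file

    with open(path, encoding="utf-8") as f:
        read_back = [line.rstrip("\n") for line in f]

print(read_back)      # ['world hello', 'people hello', 'animal hello']
```

Passing encoding explicitly avoids depending on the platform default, which is cp936 in the transcript above.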

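Common built-in functions

print

print accepts an end argument that sets what is written after the output, so it does not have to end with a newline.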

>>> a = 'apple'
>>>
>>> for i in a:
...     print(i, end='-')
...  
a-p-p-l-e-

enumerate

enumerate(iterable, start=0): the start argument sets the number the counter begins at. It does not skip any elements.

>>> for i, v in enumerate(a, 1):
...     print((i, v), end=' ')
...
(1, 'a') (2, 'p') (3, 'p') (4, 'l') (5, 'e')

Built-in functions worth remembering

Class exercise

res = []

# Each line is split on whitespace: the last field is the size, the one before it
# the color, and everything earlier is the brand (format inferred from the parsing below)
with open(r"C:\Users\Tommy\Dropbox\蓝鲸运维\第二节课-Python3基础V1.1\shoes.txt") as f:
    for line in f:
        segment = line.strip("\n").split()
        temp_dict = {"brand": ' '.join(segment[:-2]), "color": segment[-2], "size": int(segment[-1])}
        print(temp_dict)
        res.append(temp_dict)

# Sort by color
def return_color(x):
    return x["color"]
res = sorted(res, key=return_color)

# Write the results out
with open(r"C:\Users\Tommy\Dropbox\蓝鲸运维\第二节课-Python3基础V1.1\results.txt", "w") as f:
    for i in res:
        f.write("{}\t{}\t{}\n".format(i["brand"], i["color"], i["size"]))

print("finished!")

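Control flow

while-else

When the loop runs to completion (without hitting break), the statements under else are executed. This can sometimes replace a flag variable. For example, when we need to check whether a list contains a given element and print the result, the flag version looks like this: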

a_list = list("abcdefg")
index = 0
not_found_flag = True
while index < len(a_list):
    print(a_list[index], end=" ")
    if a_list[index] == 'x':
        not_found_flag = False
        print('| x found in list')
        break
    index += 1
if not_found_flag:
    print("| x not found in list")

# output
a b c d e f g | x not found in list

Using while-else, without a flag

a_list = list("abcdefg")
index = 0
while index < len(a_list):
    print(a_list[index], end=" ")
    if a_list[index] == 'x':
        print('| x found in list')
        break
    index += 1
else:
    print("| x not found in list")

# output
a b c d e f g | x not found in list
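The else clause works the same way on for loops. As a sketch, here is a check of whether every number in a list is even, written without a flag (the function name `describe` is made up for illustration):

```python
def describe(numbers):
    """Report whether every number is even, using for-else instead of a flag."""
    for n in numbers:
        if n % 2 != 0:
            result = "found an odd number: {}".format(n)
            break
    else:
        # Runs only when the loop finished without hitting break
        result = "all numbers are even"
    return result

print(describe([2, 4, 6]))   # all numbers are even
print(describe([2, 3, 4]))   # found an odd number: 3
print(describe([]))          # all numbers are even
```

The empty list takes the else branch too, because a loop that never runs also never breaks.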

Class exercise answer: http://www.pythonchallenge.com/pc/def/map.html

s = 'g fmnc wms bgblr rpylqjyrc gr zw fylb. rfyrq ufyr amknsrcpq ypc dmp. bmgle gr gl zw fylb gq glcddgagclr ylb rfyr\'q ufw rfgq rcvr gq qm jmle. sqgle qrpgle.kyicrpylq() gq pcamkkclbcb. lmu ynnjw ml rfc spj.'

hash_map = {
    "a": "c",
    "b": "d",
    "c": "e",
    "d": "f",
    "e": "g",
    "f": "h",
    "g": "i",
    "h": "j",
    "i": "k",
    "j": "l",
    "k": "m",
    "l": "n",
    "m": "o",
    "n": "p",
    "o": "q",
    "p": "r",
    "q": "s",
    "r": "t",
    "s": "u",
    "t": "v",
    "u": "w",
    "v": "x",
    "w": "y",
    "x": "z",
    "y": "a",
    "z": "b"
}

s = list(s)
for i in range(len(s)):
    if 97 <= ord(s[i]) <= 122:
        s[i] = hash_map[s[i]]

s = ''.join(s)
print(s)


# output
i hope you didnt translate it by hand. thats what computers are for. doing it in by hand is inefficient and that's why this text is so long. using string.maketrans() is recommended. now apply on the url.
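The decoded hint recommends string.maketrans(), which is the Python 2 name; in Python 3 the equivalent is the static method str.maketrans. A sketch of the same shift-by-two decoding without the hand-written dictionary (the function name `decode` is illustrative):

```python
import string

# Map each lowercase letter to the one two places later, wrapping around at 'z'
letters = string.ascii_lowercase
table = str.maketrans(letters, letters[2:] + letters[:2])

def decode(text):
    """Shift lowercase letters forward by two; leave other characters alone."""
    return text.translate(table)

print(decode("g fmnc wms"))   # i hope you
print(decode("map"))          # ocr
```

Characters that are not in the table, such as spaces and punctuation, pass through translate unchanged, so no explicit range check on ord() is needed.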

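Exception handling

try:
    # <code to monitor>
except expression as identifier:
    # <statements that handle the exception>
except expression as identifier:
    # <statements that handle the exception>
else:
    # <statements run when no exception was raised>
finally:
    # <statements that always run>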
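The template above is a skeleton rather than runnable code. A concrete sketch that exercises every clause might look like this; the function name `parse_ratio` and the choice of exceptions are illustrative only.

```python
def parse_ratio(numerator, denominator):
    """Divide two values given as strings, recording which clauses ran."""
    steps = []
    value = None
    try:
        value = int(numerator) / int(denominator)
    except ValueError as err:
        steps.append(type(err).__name__)    # a string was not a valid integer
    except ZeroDivisionError as err:
        steps.append(type(err).__name__)    # the denominator was zero
    else:
        steps.append("ok")                  # runs only if no exception was raised
    finally:
        steps.append("done")                # always runs
    return value, steps

print(parse_ratio("6", "3"))   # (2.0, ['ok', 'done'])
print(parse_ratio("6", "0"))   # (None, ['ZeroDivisionError', 'done'])
print(parse_ratio("x", "3"))   # (None, ['ValueError', 'done'])
```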

Function calls

Variable arguments

def func(a, *args, **kwargs):
    pass

Calling with unpacked arguments: note that the two forms below have the same effect.

>>> from datetime import datetime
>>> datetime(year=1997, month=7, day=1)
datetime.datetime(1997, 7, 1, 0, 0)
>>> kwargs = {"year": 1997, "month": 7, "day": 1}
>>> datetime(**kwargs)
datetime.datetime(1997, 7, 1, 0, 0)
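Collecting and unpacking are two sides of the same syntax. This sketch uses a function with the `func(a, *args, **kwargs)` signature and simply returns what it received:

```python
def func(a, *args, **kwargs):
    """Return the arguments exactly as they were collected."""
    return a, args, kwargs

print(func(1))                 # (1, (), {})
print(func(1, 2, 3, x=4))      # (1, (2, 3), {'x': 4})

# Unpacking at the call site: * spreads a sequence, ** spreads a dict
positional = [1, 2, 3]
keywords = {"x": 4}
print(func(*positional, **keywords))   # (1, (2, 3), {'x': 4})
```

Extra positional arguments arrive as a tuple in args, and extra keyword arguments arrive as a dict in kwargs.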

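Common standard library modules

datetime is Python's standard library for handling dates and times.
collections is a built-in module that provides many useful container classes.
base64 encodes arbitrary binary data using 64 printable characters.
hashlib provides common digest algorithms such as MD5 and SHA1.
itertools provides very useful functions for working with iterables.
urllib provides a set of functions for working with URLs.
HTMLParser (in html.parser) makes it easy to parse HTML.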
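A one-line taste of several of the modules listed above, as a rough sketch; the inputs are arbitrary examples chosen for illustration.

```python
import base64
import hashlib
import itertools
from collections import Counter
from datetime import datetime, timedelta
from urllib.parse import urlparse

later = datetime(1997, 7, 1) + timedelta(days=30)
top = Counter("apple").most_common(1)
encoded = base64.b64encode(b"hello").decode("ascii")
digest = hashlib.md5(b"hello").hexdigest()
pairs = list(itertools.combinations("abc", 2))
path = urlparse("http://www.pythonchallenge.com/pc/def/map.html").path

print(later)     # 1997-07-31 00:00:00
print(top)       # [('p', 2)]
print(encoded)   # aGVsbG8=
print(digest)    # 5d41402abc4b2a76b9719d911017c592
print(pairs)     # [('a', 'b'), ('a', 'c'), ('b', 'c')]
print(path)      # /pc/def/map.html
```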

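The os module provides operating-system-level operations

os.getcwd() returns the current working directory, i.e. the directory the Python script is running in
os.chdir("dirname") changes the current working directory; like cd in the shell
os.curdir is the string for the current directory ('.'); it is a constant, not a function
os.pardir is the string for the parent directory ('..'); it is a constant, not a function
os.removedirs('dirname1') removes the directory if it is empty, then moves up to the parent and removes that too if it is empty, and so on
os.mkdir('dirname') creates a single directory; like mkdir dirname in the shell
os.rmdir('dirname') removes a single empty directory and raises an error if the directory is not empty; like rmdir dirname in the shell
os.remove() deletes a file
os.rename("oldname", "newname") renames a file or directory
os.sep is the OS-specific path separator: "\\" on Windows, "/" on Linux
os.linesep is the line terminator used by the current platform: "\r\n" on Windows, "\n" on Linux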
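Several of these calls can be tried safely inside a temporary directory. The sketch below is one way to do that; the names `demo`, `old.txt`, and `new.txt` are arbitrary.

```python
import os
import tempfile

start_dir = os.getcwd()

with tempfile.TemporaryDirectory() as tmp_dir:
    os.chdir(tmp_dir)
    try:
        os.mkdir("demo")
        old_path = os.path.join("demo", "old.txt")
        new_path = os.path.join("demo", "new.txt")
        with open(old_path, "w") as f:
            f.write("hello")
        os.rename(old_path, new_path)
        renamed = os.path.exists(new_path) and not os.path.exists(old_path)
        os.remove(new_path)
        os.rmdir("demo")              # succeeds because demo is empty again
        removed = not os.path.exists("demo")
    finally:
        os.chdir(start_dir)           # leave the temp dir before it is deleted

print(renamed, removed)               # True True
print(os.curdir, os.pardir)           # . ..
print(repr(os.sep), repr(os.linesep)) # platform dependent
```

Building paths with os.path.join rather than hard-coding os.sep keeps the code portable between Windows and Linux.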

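The sys module provides variables and functions related to the Python runtime environment

sys.argv is the list of command-line arguments; the first element is the path of the program itself
sys.exit(n) exits the program; use exit(0) for a normal exit
sys.version is the version information of the Python interpreter
sys.maxsize is the largest value a container index can take (Python 2's sys.maxint, the largest int, was removed in Python 3, where int is unbounded)
sys.path is the module search path, initialized from the PYTHONPATH environment variable plus installation defaults
sys.platform is the name of the operating system platform
sys.stdout.write('please') writes a string to standard output
val = sys.stdin.readline()[:-1] reads one line from standard input and drops the trailing newline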

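Common third-party Python modules

PIL, image processing for Python.
Paramiko, an SSH library for Python.
Numpy, scientific computing.
Matplotlib, plotting.
Scrapy, web crawling.
Selenium, the Python bindings for the Selenium browser automation testing tool.
Gevent, a high-concurrency networking library.
twisted, an event-driven networking engine framework.
sh, a powerful tool for system administration.
Jinja2, a template engine.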


Unless otherwise stated, all articles on this blog are licensed under CC BY-SA 4.0. Please credit the source when reposting.
