Can't figure out the cause — hoping someone here can help take a look.
The project uses Django + uWSGI + Nginx.
uWSGI config:

```
[uwsgi]
pythonpath = /usr/local/server
chdir = /home/server
env = DJANGO_SETTINGS_MODULE=conf.settings
module = server.wsgi
master = True
pidfile = logs/server.pid
vacuum = True
max-requests = 1000
enable-threads = true
processes = 4
threads = 8
listen = 1024
daemonize = logs/wsgi.log
http = 0.0.0.0:16020
buffer-size = 32000
```
There is a request that exports an Excel file from the server, pulling the full dataset of 400k+ rows.
The response takes too long, so the browser ends up with an Nginx 504.
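The 504 itself is just Nginx giving up on the upstream. A sketch of the relevant knob, assuming Nginx forwards to the uWSGI `http = 0.0.0.0:16020` endpoint with `proxy_pass` (the location path and the 600s value are illustrative, not from the original config):

```
location /export/ {
    proxy_pass http://127.0.0.1:16020;
    # Nginx returns 504 when the upstream stays silent longer than proxy_read_timeout
    # (60s by default); raising it only hides the slow export, it doesn't fix memory use
    proxy_read_timeout 600s;
}
```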
Meanwhile, the memory of one of the worker processes keeps climbing:
```
top -p 866
  PID USER PR NI  VIRT  RES  SHR S  %CPU %MEM   TIME+  COMMAND
  866 soe  20  0 7059m 5.3g 5740 S 100.8 33.9 4:17.34  uwsgi --ini /home/smb/work/soe_server
```
The core code:
```
from openpyxl import Workbook, load_workbook
from openpyxl.cell.cell import ILLEGAL_CHARACTERS_RE
from openpyxl.writer.excel import save_virtual_workbook
from django.http import HttpResponse

...

class ExcelTableObj(object):
    def __init__(self, file_name=None):
        if file_name:
            self.file_name = file_name
            self.wb = load_workbook(file_name)
        else:
            self.wb = Workbook()

    def create_new_sheet(self, title='Sheet1'):
        new_ws = self.wb.create_sheet(title=title)

    def write_to_sheet(self, sheetname, datas, filename):
        ws = self.wb[sheetname]
        for data in datas:
            ws.append(data)
        self.wb.save(filename)

    def update_sheet_name(self, sheetname):
        ws = self.wb.active
        ws.title = sheetname

    def append_data_to_sheet(self, sheetname, data):
        ws = self.wb[sheetname]
        ws.append(data)

    def save_file(self, file_name):
        self.wb.save(file_name)
        self.wb.close()

    def get_upload_file_data(self, name=None):
        if name:
            ws = self.wb.get_sheet_by_name(name)
        else:
            ws = self.wb.worksheets[0]
        rows = ws.max_row
        cols = ws.max_column
        file_data = []
        fields = []
        for i in range(1, cols + 1):
            cell = ws.cell(row=1, column=i)
            if cell.value:
                fields.append(cell.value.lower().strip())
        for row in range(2, rows + 1):
            row_data = {}
            for j in range(len(fields)):
                value = ws.cell(row=row, column=j + 1).value
                if value:
                    row_data[fields[j]] = str(value).strip()
            if row_data:
                file_data.append(row_data)
        return file_data

    def get_sheet_maxrow(self, name):
        ws = self.wb.get_sheet_by_name(name)
        rows = ws.max_row
        return rows


def _get_download_data(datas):
    for data in queryset:
        ...
        item = [str(data.account_id),
                ILLEGAL_CHARACTERS_RE.sub(r'', data.account_name) if data.account_name else data.account_name,
                type,
                fb_aac_conf.FB_ACCOUNT_STATUS[data.account_status],
                data.submitter,
                data.submit_time,
                data.confirmor,
                data.confirm_time,
                fb_aac_conf.BATCH_STATUS[data.status],
                data.reason,
                data.entity_name,
                data.payment_name,
                data.sale,
                data.ae_note,
                urgent]
        yield item


queryset = MyModel.objects.filter(...)  # about 450k rows
datas = _get_download_data(queryset)

excel = ExcelTableObj()
excel.update_sheet_name(sheetname)
excel.append_data_to_sheet(sheetname, title)
excel.write_to_sheet(sheetname, datas, filename)
excel.save_file(filename)

response = HttpResponse(save_virtual_workbook(excel.wb),
                        content_type='application/vnd.openxmlformats-officedocument.spreadsheetml.sheet')
response['Content-Disposition'] = 'attachment; filename={}'.format(filename)
```
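One likely contributor, as a guess the post itself doesn't confirm: looping over `MyModel.objects.filter(...)` directly fills Django's QuerySet result cache, so all ~450k model instances stay alive for the whole request on top of whatever openpyxl holds. A minimal sketch of feeding the generator from `QuerySet.iterator()` instead, which skips that cache (`iter_download_rows` and the shortened row are illustrative only, and `chunk_size` requires Django ≥ 2.0):

```
def iter_download_rows(queryset):
    # .iterator() streams rows from the database cursor instead of caching
    # every model instance on the QuerySet; chunk_size bounds each fetch.
    for data in queryset.iterator(chunk_size=2000):
        # only a couple of the post's columns are shown here
        yield [str(data.account_id), data.submitter, data.submit_time]
```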
Could everyone help analyze what's causing this? Thanks.
1 ch2 2021-03-19 17:49:01 +08:00
2 BeautifulSoap 2021-03-19 17:52:56 +08:00
With openpyxl, never mind exporting — even just reading a reasonably large Excel file takes ten-odd seconds.
3 chenqh 2021-03-19 18:04:02 +08:00
Isn't the usual way to export Excel to generate a download URL and let the frontend fetch the file itself?
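A rough sketch of what reply #3 describes, with made-up names (`build_workbook` and the `export-*.xlsx` naming are hypothetical): the workbook is written under MEDIA_ROOT, ideally by a background worker, and the view only returns a URL for the frontend to download:

```
import os
import uuid

from django.conf import settings
from django.http import JsonResponse


def start_export(request):
    filename = 'export-{}.xlsx'.format(uuid.uuid4().hex)
    path = os.path.join(settings.MEDIA_ROOT, filename)
    build_workbook(path)  # hypothetical helper; in practice run it in a task queue, not inline
    # the frontend follows this URL instead of waiting on one long response
    return JsonResponse({'download_url': settings.MEDIA_URL + filename})
```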
5 ryd994 2021-03-19 18:25:18 +08:00 via Android
Why export xlsx at all? If it's just data, a CSV would do — Excel can open it directly.
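Reply #5's CSV route also pairs well with Django's streaming-response pattern, so no complete file ever sits in memory. A sketch assuming the post's `MyModel` with just two of its columns; `Echo` is the pseudo-buffer trick from the Django documentation:

```
import csv

from django.http import StreamingHttpResponse


class Echo:
    """Pseudo-buffer: csv.writer calls write(), which just hands the line back."""
    def write(self, value):
        return value


def export_csv(queryset, filename='export.csv'):
    writer = csv.writer(Echo())
    rows = ([str(obj.account_id), obj.account_name or ''] for obj in queryset.iterator())
    response = StreamingHttpResponse(
        (writer.writerow(row) for row in rows),
        content_type='text/csv',
    )
    response['Content-Disposition'] = 'attachment; filename={}'.format(filename)
    return response
```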
6 vegetableChick OP
@ch2 I can understand it getting stuck, but why does the memory usage keep climbing?
7 vegetableChick OP
@BeautifulSoap ...why does it eat so much memory, and keep climbing the whole time?
8 vegetableChick OP
@no1xsyzy Why does the process memory keep shooting up?
9 superrichman 2021-03-19 21:17:01 +08:00
These few hundred thousand rows get copied around in memory who knows how many times, nothing gets released when you're done with it, and openpyxl isn't even in write_only mode — it'd be a miracle if memory didn't blow up. Step through it yourself with a debugger. Also:

```
def _get_download_data(datas):
    for data in queryset:
        ...

queryset = MyModel.objects.filter(...)
datas = _get_download_data(queryset)
```

Are you sure that's how the function is supposed to take its argument?
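For reply #9's write_only point, a minimal sketch of openpyxl's streaming writer mode — `build_export` is an illustrative name, and `rows` could be the post's `_get_download_data()` generator:

```
from openpyxl import Workbook


def build_export(rows, path):
    # write_only keeps only the row currently being appended in memory and
    # streams the sheet to disk on save(), instead of building the whole cell tree.
    wb = Workbook(write_only=True)
    ws = wb.create_sheet(title='Sheet1')
    for row in rows:
        ws.append(row)
    wb.save(path)  # serialize once; skip the extra save_virtual_workbook() copy
```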
10 BeautifulSoap 2021-03-19 21:47:58 +08:00
Unrelated to the previous reply's point — I hadn't even noticed that. But as someone who has made the same mistake, a reminder for the OP: the plural of data is still data, not datas.
11 ericls 2021-03-19 21:54:16 +08:00 via iPhone
@BeautifulSoap The singular of data is datum.
12 BeautifulSoap 2021-03-19 22:17:39 +08:00
@ericls Language keeps evolving; datum isn't used much in practice, and treating data as a singular has become a generally accepted usage in the English-speaking world.
13 FindHao 2021-03-20 00:01:17 +08:00 via Android
I have a piece of garbage code like this too; my eventual fix was a brute-force crontab job that restarts my own service.
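uWSGI also has a built-in equivalent of that crontab restart; a sketch in the same ini format as the config at the top (the 800 MB threshold is an arbitrary example, and recycling workers masks the leak rather than fixing the export):

```
[uwsgi]
; recycle a worker once its resident memory passes the threshold, in MB
reload-on-rss = 800
```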