实际开发过程中，会因为 lgn 倍的时间复杂度优化而牺牲代码可读性吗

V2EX = way to explore

V2EX 是一个关于分享和探索的地方

已注册用户请登录

这是一个创建于 1822 天前的主题，其中的信息可能经有所发展或是发生改变。

1. 场景描述：

TLDR: 在 Draft.js 富文本框架中，根据文本的 type 可以将每一行文本划分为

 'unstyled', 'header-one', 'header-two', ..., 'header-six', 'atomic', // 其他类型暂不考虑

其中，在处理 unstyled 文本时，

如果仅包含内置行内样式（BOLD, ITALIC, ...），那么直接根据 inlineStyleRanges 进行操作
如果出现了 Decorator 实现的行内样式（比如 @某人），那么必须对 inlineStyleRanges 和 EntityRanges 先进行合并操作，即根据 offset 排序数组（升序）

已知 inlineStyleRanges 和 EntityRanges 都是 sorted.

根据对象的 offset 属性，对两个 sorted 的数组进行升序排序 （ Merge Sorted Arrays)

2. 样例

类型定义：

type InlineStyleRangesType = { offset: number, length: number, style: 'BOLD' | 'CODE' | 'ITALIC' | 'STRIKETHROUGH' | 'UNDERLINE', }[] type EntityRangesType = { offset: number, length: number, key: number, }[] type SortedArrayType = { offset: number, length: number, key?: number, style?: 'BOLD' | 'CODE' | 'ITALIC' | 'STRIKETHROUGH' | 'UNDERLINE', }[]

输入：

const a: InlineStyleRangesType = [ { offset: 1, length: 1, style: "BOLD" }, { offset: 9, length: 2, style: "ITALIC" }, ] const b: EntityRangesType = [ { offset: 4, length: 1, key: 0 }, { offset: 12, length: 1, key: 1 }, ]

输出：

[ { offset: 1, length: 1, style: "BOLD" }, { offset: 4, length: 1, key: 0 }, { offset: 9, length: 2, style: "ITALIC" }, { offset: 12, length: 1, key: 1 }, ]

3. 解法

本人解法

// version 1 function mergeSortedArrays( inlineStyleRanges: InlineStyleRangesType, entityRanges: EntityRangesType ): SortedArrayType { const ranges = [] inlineStyleRanges.forEach((range) => ranges.push(range)) entityRanges.forEach((range) => ranges.push(range)) ranges.sort((a, b) => a.offset - b.offset) return ranges }

Interview Cake 解法

// version 2 function mergeSortedArrays( inlineStyleRanges: InlineStyleRangesType, EntityRanges: EntityRangesType ): SortedArrayType { const ranges = (inlineStyleRanges as SortedArrayType).concat(EntityRanges) ranges.sort((a, b) => a.offset - b.offset) return ranges }

// version 3 function mergeSortedArrays( inlineStyleRanges: InlineStyleRangesType, EntityRanges: EntityRangesType ): SortedArrayType { const ranges = [] let currentIndexEntity = 0; let currentIndexInlineStyle = 0; let currentIndexMerged = 0; while (currentIndexMerged < (inlineStyleRanges.length + EntityRanges.length)) { const isInlineStyleRangesExhausted = currentIndexInlineStyle >= inlineStyleRanges.length; const isEntityRangesExhausted = currentIndexEntity >= EntityRanges.length; // Case: next comes from inlineStyleRanges // inlineStyleRanges must not be exhausted, and EITHER: // 1) EntityRanges IS exhausted, or // 2) The current element in inlineStyleRanges is less // than the current element in EntityRanges if (!isInlineStyleRangesExhausted && (isEntityRangesExhausted || (inlineStyleRanges[currentIndexInlineStyle].offset < EntityRanges[currentIndexEntity].offset) )) { ranges[currentIndexMerged] = inlineStyleRanges[currentIndexInlineStyle]; currentIndexInlineStyle++; // Case: next comes from EntityRanges } else { ranges[currentIndexMerged] = EntityRanges[currentIndexEntity]; currentIndexEntity++; } currentIndexMerged++; } return ranges }

4. 算法分析

	version 1	version 2	version 3
是否利用了 sorted 特性	否	否	是
时间复杂度 *	O(nlgn)	O(nlgn)	O(n)
空间复杂度	O(n)	O(n)	O(n)
代码可读性	★★★★★	★★★★	★

* version 2 其实要优于 version 1 （常数倍）

25 条回复 2020-10-22 12:55:52 +08:00

YadongZhang

2020-10-21 16:48:52 +08:00

v2ex 似乎不支持 ts 语法高亮

jatai

2020-10-21 18:06:33 +08:00 via Android

实际开发过程中, 即使没有任何技术难度, 也要牺牲代码的可读性, 为了显得不可替代

raaaaaar

2020-10-21 18:08:08 +08:00 via Android

代码可读性真的和性能冲突吗？我有点疑惑，看那些经常炫技的代码，代码规范做好，命名，封装，注释，文档这么一套下来，真的会可读性低？

Leonard

2020-10-21 18:09:12 +08:00

注释写得好就行

gfreezy

2020-10-21 18:12:33 +08:00

一行也就 1000 个字符串顶天了吧。啥排序在性能上都没啥区别吧。O(n^2) 在实际使用中都没啥区别吧？

如果确实追求性能，可以单独写个通用 mergesort 函数，贴个算法文档链接，再复杂的实现都没关系，一般都直接读文档就好了，没必要读代码。

dreamapple

2020-10-21 18:24:41 +08:00 via Android

脚本语言直接调用内置 sort 和手写实现 merge 真的不知道哪个快，profile 一下说不定差别不大。实际项目亿级别的我选择 mapreduce

YadongZhang

2020-10-21 18:26:21 +08:00 via Android

@gfreezy #5 好像是这样的，每行文本长度有限

xumng123

2020-10-21 20:45:09 +08:00 via iPhone

不是实时系统或者上百万的并发量，没啥区别

liyunlong41

2020-10-21 20:50:55 +08:00 via iPhone

突然想到力扣链表找环的问题，有快慢指针和哈希表两种做法，两者时间复杂度相同但是前者省内存，可是前者难于理解和维护，后者简单易懂，假如实际业务上有类似地方，我可能会用哈希表的做法。

anguiao

2020-10-21 21:02:02 +08:00 via Android

如果是写偏底层的类库，应该考虑一下性能问题。
如果只是业务代码的话，在没有碰到性能瓶颈之前，个人觉得还是可读性比较重要。

ipwx

2020-10-21 21:35:04 +08:00

不是，现在前端程序员都这样了嘛，version 3 也叫不可读？

Sasasu

2020-10-21 22:32:29 +08:00

version 3 就是刻意写的很难懂，cpp 标准库版本

template<class InputIt1, class InputIt2, class OutputIt>
OutputIt merge(InputIt1 first1, InputIt1 last1,
InputIt2 first2, InputIt2 last2,
OutputIt d_first)
{
for (; first1 != last1; ++d_first) {
if (first2 == last2) {
return std::copy(first1, last1, d_first);
}
if (*first2 < *first1) {
*d_first = *first2;
++first2;
} else {
*d_first = *first1;
++first1;
}
}
return std::copy(first2, last2, d_first);
}

Sasasu

2020-10-21 22:36:44 +08:00

第一种就是刻意写的难懂，刻意使用下标，合并一个数组用光的条件到主循环里，合并判断条件到控制里

```
template<class InputIt1, class InputIt2,
class OutputIt, class Compare>
OutputIt merge(InputIt1 first1, InputIt1 last1,
InputIt2 first2, InputIt2 last2,
OutputIt d_first, Compare comp)
{
for (; first1 != last1; ++d_first) {
if (first2 == last2) {
return std::copy(first1, last1, d_first);
}
if (comp(*first2, *first1)) {
*d_first = *first2;
++first2;
} else {
*d_first = *first1;
++first1;
}
}
return std::copy(first2, last2, d_first);
}
```

ref https://en.cppreference.com/w/cpp/algorithm/merge

beidounanxizi

2020-10-22 00:54:37 +08:00

怎么说吧 premature optimize 是不可取的，lgN 也要看基数 N 啊 N 100 内有区别嘛？

YadongZhang

2020-10-22 01:02:49 +08:00 via Android

@beidounanxizi #14 复杂度说的是 asymptotic analysis

YadongZhang

2020-10-22 01:09:59 +08:00 via Android

这个函数会在另一个主函数里调用，而这个主函数是个 O(n^2) 的算法实现，优化的部分在这儿

YadongZhang

2020-10-22 01:22:51 +08:00 via Android

@ipwx #11 edge cases 的考量

islxyqwe

2020-10-22 08:17:49 +08:00

建一个 O(n)的<T>(a:T[],b:T[],cmp:(x:T,y:T)=>number):T[]然后调用

gimp

2020-10-22 08:25:17 +08:00

如何编写无法维护的代码，让自己稳拿铁饭碗 ;-)

https://coderlmn.github.io/frontEndCourse/unmaintainable.html

Nugine0

2020-10-22 09:07:18 +08:00 via Android

合并排序数组不是基础算法吗？应该有现成的轮子。
如果没有轮子，函数上面注释写清楚，加几个测试，没人会关心里面代码可不可读。

bleepbloop

2020-10-22 09:57:57 +08:00

工作好几年了，从没遇到数据量大到需要刻意优化算法的地步，可能我太菜了吧

py2ex

2020-10-22 10:28:44 +08:00

以为 lgn 是什么新词，原来是 log N

YadongZhang

2020-10-22 11:17:19 +08:00

@py2ex #22

CLRS 3e, Chap.1, Sec.3.2, P57

By equation (3.15) 换底公式, changing the base of a logarithm from one constant to another changes the value of the logarithm by only aconstant factor, and so we shall often use the notation “lg n” when we don’t care about constant factors, such as in O-notation.

shadeofgod

2020-10-22 12:55:07 +08:00

合并两个有序数组的也可以很容易写出兼顾可读性的写法吧？

shadeofgod

2020-10-22 12:55:52 +08:00