我有一个 3000w 行的数据表,我需要在这个表上统计从某一年开始新参保的人数,原数据库是 ORACLE ,用的是
AND A.AAC001 NOT EXISTS (SELECT 1 FROM AC02_TEMP AS B WHERE A.AAC001 = B.AAC001 AND B.AAC030 < '2018-01-01 00:00:00') 的语法,在 clickhouse 上我试了 LEFT JOIN 和 NOT IN ,性能均不理想
SELECT COUNT(1) AS "新参保人数" FROM AC02_TEMP AS A WHERE A.AAB301 IN (SELECT AAB301 FROM AA26 WHERE AAA148 = '130800') AND A.AAE200 = '41' AND A.AAC031 = '1' AND A.AAC030 >= '2018-01-01 00:00:00' AND A.AAC001 NOT IN (SELECT B.AAC001 FROM AC02_TEMP AS B WHERE B.AAC030 < '2018-01-01 00:00:00'); 以下是 explain
CreatingSets (Create sets before main query execution) Expression ((Projection + Before ORDER BY)) Aggregating Expression (Before GROUP BY) ReadFromMergeTree (default.AC02_TEMP) Indexes: PrimaryKey Keys: AAC001 AAE200 " Condition: and((AAC001 notIn 18692488-element set), (AAE200 in ['41', '41']))" Parts: 2/2 Granules: 4821/4821 CreatingSet (Create set for subquery) Expression ((Projection + Before ORDER BY)) ReadFromMergeTree (default.AA26) Indexes: PrimaryKey Condition: true Parts: 1/1 Granules: 1/1 我是 clickhouse 新手,目前没什么头绪,求大佬帮助 0.0
