且构网

分享程序员开发的那些事...
且构网 - 分享程序员编程开发的那些事

Django模型 - SELECT DISTINCT(foo)FROM表太慢了

更新时间:2023-12-01 13:46:34

谢谢@solarissmoke指向 connection.queries



我期待看到

  SELECT DISTINCT refdate FROM myTable 

相反,我有

  SELECT DISTINCT refdate,itemIndex,itemType FROM myTable ORDER BY itemIndex,refdate,itemType。 

然后我看看在 models.py $ c $中定义的myTable c>。

  unique_together =(('nodeIndex','refdate','nodeType'),)
= ['nodeIndex','refdate','nodeType']

与默认排序或order_by的交互


通常你不会想要额外的列在结果中播放一部分,所以清除订购,或者至少确保它被限制


所以我试过 order_by()来刷新以前定义的顺序和瞧瞧!

  myTable.objects.values('refdate')。 order_by()。distinct()


I have a MySQL table with 13M rows. I can query the db directly as

SELECT DISTINCT(refdate) FROM myTable

The query takes 0.15 seconds and is great.

The equivalent table defined as a Django model and queried as

myTable.objects.values(`refdate`).distinct()

takes a very long time. Is it because there are too many items in the list before distinct(). How do I do this in a manner that doesn't bring everything down?

Thank you @solarissmoke for the pointer to connection.queries.

I was expecting to see

SELECT DISTINCT refdate FROM myTable

Instead, I got

SELECT DISTINCT refdate, itemIndex, itemType FROM myTable ORDER BY itemIndex, refdate, itemType. 

I then looked at myTable defined in models.py.

unique_together = (('nodeIndex', 'refdate', 'nodeType'), )
ordering = ['nodeIndex', 'refdate', 'nodeType']

From Interaction with default ordering or order_by

normally you won’t want extra columns playing a part in the result, so clear out the ordering, or at least make sure it’s restricted only to those fields you also select in a values() call.

So I tried order_by() to flush the previously defined ordering and voila!

myTable.objects.values('refdate').order_by().distinct()