且构网

Removing duplicates in a Django query

Updated: 2023-02-13 16:22:43

This query does not produce duplicates of its own: it returns every row in the table, ordered by email.

However, I presume what you mean is that you have duplicate data within your database. Adding distinct() here won't help, because even if your model has only one field, it also has an automatic id field, so every id + email combination is unique and no rows are collapsed.

Assuming you only need one field, the email address, de-duplicated, you can do this:

email_list = Email.objects.values_list('email', flat=True).distinct()
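
To see why the extra id column defeats distinct(), here is a minimal sketch of the two queries as plain SQL, run with Python's standard-library sqlite3 module so it works without a Django project. The `email` table and its sample rows are hypothetical stand-ins for the Email model, and the SQL is only an approximation of what the ORM would generate.

```python
import sqlite3

# Hypothetical stand-in for the Email model's table: an automatic id plus an email column.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE email (id INTEGER PRIMARY KEY, email TEXT)")
conn.executemany(
    "INSERT INTO email (email) VALUES (?)",
    [("a@example.com",), ("b@example.com",), ("a@example.com",)],
)

# DISTINCT over (id, email): every row has its own id, so nothing collapses.
with_id = conn.execute(
    "SELECT DISTINCT id, email FROM email ORDER BY email"
).fetchall()

# DISTINCT over email alone: repeated addresses collapse to a single value.
email_only = [
    row[0]
    for row in conn.execute("SELECT DISTINCT email FROM email ORDER BY email")
]

print(len(with_id))  # 3
print(email_only)    # ['a@example.com', 'b@example.com']
```

Restricting the select list to the one column is what values_list('email', flat=True) does on the ORM side, which is why distinct() becomes effective there.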

However, you should really fix the root problem, and remove the duplicate data from your database.
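
Before deleting anything, it may help to see which addresses are actually duplicated and how often. The sketch below does this at the SQL level with the standard-library sqlite3 module; the `email` table, the sample data, and the helper name `duplicated_emails` are all hypothetical. In the ORM, a roughly equivalent query would group with values('email') and annotate a Count.

```python
import sqlite3


def duplicated_emails(conn):
    """Return (email, count) pairs for every address stored more than once."""
    return conn.execute(
        "SELECT email, COUNT(*) FROM email "
        "GROUP BY email HAVING COUNT(*) > 1 ORDER BY email"
    ).fetchall()


# Hypothetical sample data: 'a' appears three times, 'c' twice, 'b' once.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE email (id INTEGER PRIMARY KEY, email TEXT)")
conn.executemany(
    "INSERT INTO email (email) VALUES (?)",
    [(e,) for e in [
        "a@example.com", "b@example.com", "a@example.com",
        "c@example.com", "a@example.com", "c@example.com",
    ]],
)

print(duplicated_emails(conn))  # [('a@example.com', 3), ('c@example.com', 2)]
```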

For example, to delete duplicate Email rows by the email field, keeping one row per address:

# For each distinct address, delete every row except the one with the lowest id.
for email in Email.objects.values_list('email', flat=True).distinct():
    Email.objects.filter(pk__in=Email.objects.filter(email=email).order_by('id').values_list('id', flat=True)[1:]).delete()

Or, for books, by name:

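# Same pattern for Book: the inner query must use Book (not another model), and [1:] keeps one row per name.
for name in Book.objects.values_list('name', flat=True).distinct():
    Book.objects.filter(pk__in=Book.objects.filter(name=name).order_by('id').values_list('id', flat=True)[1:]).delete()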
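
The loops above issue one delete per distinct value. As an alternative sketch, not part of the original answer, the same "keep the lowest id per value" rule can be expressed as a single SQL statement. The example uses the standard-library sqlite3 module with a hypothetical `email` table and a hypothetical helper `delete_duplicates`; whether a database accepts a subquery on the table being deleted from varies by backend, so treat this as an illustration rather than a drop-in replacement.

```python
import sqlite3


def delete_duplicates(conn):
    """Delete rows whose id is not the smallest id for their email.

    Returns the number of rows removed.
    """
    cur = conn.execute(
        "DELETE FROM email "
        "WHERE id NOT IN (SELECT MIN(id) FROM email GROUP BY email)"
    )
    conn.commit()
    return cur.rowcount


# Hypothetical sample data: ids 1..5 are assigned in insertion order.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE email (id INTEGER PRIMARY KEY, email TEXT)")
conn.executemany(
    "INSERT INTO email (email) VALUES (?)",
    [(e,) for e in [
        "a@example.com", "b@example.com", "a@example.com",
        "a@example.com", "b@example.com",
    ]],
)

removed = delete_duplicates(conn)
remaining = conn.execute("SELECT id, email FROM email ORDER BY id").fetchall()

print(removed)    # 3
print(remaining)  # [(1, 'a@example.com'), (2, 'b@example.com')]
```

Keeping the lowest id is an arbitrary choice made for the sketch; if another row should survive (for example the most recently updated one), the aggregate in the subquery would need to change accordingly.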