且构网

Complex MySQL query gives incorrect results

Updated: 2022-10-14 23:01:25

This is not a duplicate. The previous question and its answer are at "Writing a Complex MySQL Query". Thorough testing showed that the previous answer is incorrect.

I have 3 tables.

Table Words_Learned contains all the words known by a user and the order in which the words were learned. It has 3 columns: 1) word ID, 2) user ID and 3) the order in which the word was learned.

Table Article contains the articles. It has 3 columns: 1) article ID, 2) unique word count and 3) article contents.

Table Words contains a list of all unique words contained in each article. It has 2 columns: 1) word ID and 2) article ID.

The database diagram is shown below.
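For readers who want to experiment, here is a minimal sketch of the three tables described above, built in an in-memory SQLite database through Python's `sqlite3` module as a stand-in for MySQL. The column names `idwords`, `idArticle`, `userId`, `order` and `content` come from the query later in this post. The name `wordCount` for the unique word count column is an assumption, since the post never names that column.

```python
import sqlite3

# Sketch of the schema described in the question.
# "wordCount" is a hypothetical column name; the post does not give it.
SCHEMA = """
CREATE TABLE article (
    idArticle INTEGER PRIMARY KEY,
    wordCount INTEGER,
    content   TEXT
);
CREATE TABLE words (
    idwords   INTEGER,
    idArticle INTEGER REFERENCES article(idArticle)
);
CREATE TABLE words_learned (
    idwords INTEGER,
    userId  INTEGER,
    "order" INTEGER
);
"""

def make_db():
    """Create an empty in-memory database with the three tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn

conn = make_db()
tables = sorted(
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
)
print(tables)
```

`order` is a reserved word, which is why it is quoted here and written with backticks in the MySQL query.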

Using this database and MySQL only, I need to do the following.

Given a user ID, it should get a list of all words known by this user, sorted in the reverse of the order in which they were learned. In other words, the most recently learned words will be at the top of the list.

Let’s say that a query on a user ID shows that they’ve memorized the following 3 words, and we track the order in which they’ve learned the words.

Octopus - 3
Dog - 2
Spoon - 1

First we get a list of all articles containing the word Octopus, and then do the calculation using table Words on just those articles. The calculation is: if an article contains more than 10 words that do not appear in the user’s vocabulary list (pulled from table Words_Learned), it is excluded from the listing.

Then, we do a query for all records that contain Dog, but DO NOT contain Octopus.

Then, we do a query for all records that contain Spoon, but DO NOT contain the words Octopus or Dog.

We keep repeating this process until we’ve found 100 records that meet these criteria.

To achieve this, I wrote the query below. (Please visit the SQLFiddle link to see the table structures, test data and my query.)

http://sqlfiddle.com/#!2/48dae/1

The results my query generates are invalid, as you can see in the fiddle. With a "proper query", the result should be:

Level 1
Level 1
Level 1
Level 2
Level 2
Level 2
Level 3
Level 3

Here is some pseudocode for better understanding.

Do while articles found < 100
{
 for each ($X in known words, most recently learned first)
 {
  Select all articles that contain the word $X, where 1) the article has not been included in any previous loop, and 2) the count of "unknown" words is less than 10.

  Keep these articles in order.
 }
}
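
The pseudocode above can be sketched in plain Python to make the intended result concrete before translating it to SQL. This is only an illustration under stated assumptions: the function name `pick_articles`, the dictionary-based inputs and the sample data are all made up, articles matching the same word are ordered by article ID, and the threshold follows the pseudocode (fewer than 10 unknown words).

```python
def pick_articles(learned, article_words, max_unknown=10, limit=100):
    """Return (article_id, matched_word) pairs in the order described above.

    learned:       dict mapping word -> order in which the user learned it
    article_words: dict mapping article_id -> set of unique words it contains
    """
    known = set(learned)
    picked = []
    seen = set()
    # Walk the known words from most recently learned to oldest.
    for word in sorted(learned, key=learned.get, reverse=True):
        for article_id in sorted(article_words):
            words = article_words[article_id]
            if article_id in seen or word not in words:
                continue
            # Skip articles with too many words the user does not know.
            if len(words - known) >= max_unknown:
                continue
            seen.add(article_id)
            picked.append((article_id, word))
            if len(picked) == limit:
                return picked
    return picked


learned = {"spoon": 1, "dog": 2, "octopus": 3}
article_words = {
    1: {"octopus", "dog"},
    2: {"dog", "spoon"},
    3: {"spoon"},
    4: {"octopus"} | {f"x{i}" for i in range(10)},  # 10 unknown words
    5: {"cat"},                                     # no known word at all
}
result = pick_articles(learned, article_words)
print(result)
```

Article 4 is dropped because it has 10 unknown words, and article 5 never appears because it contains no word the user knows.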

Please help.

-- Articles in which user 4 knows at least one word, with the highest
-- `order` value among those known words.
select * from (
    select a.idArticle, a.content, max(`order`) max_order
    from words_learned wl
    join words w on w.idwords = wl.idwords
    join article a on a.idArticle = w.idArticle
    where wl.userId = 4
    group by a.idArticle
) a
-- Per article, the number of words user 4 has not learned.
left join (
    select count(*) unknown_count, w2.idArticle from words w2
    left join words_learned wl2 on wl2.idwords = w2.idwords
        and wl2.userId = 4
    where wl2.idwords is null
    group by w2.idArticle
) unknown_counts on unknown_counts.idArticle = a.idArticle
-- Keep articles with fewer than 10 unknown words, newest known word first.
where unknown_count is null or unknown_count < 10
order by max_order desc
limit 100

http://sqlfiddle.com/#!2/6944b/9

The first derived table selects each article in which the given user knows at least one word, together with the maximum order value among those known words. The maximum order value is used to sort the final results so that articles containing the most recently learned words appear first.
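
To see what the first derived table returns on its own, here is a small sketch that runs it against an in-memory SQLite database through Python's `sqlite3` module. SQLite is only a stand-in for MySQL here, and the sample rows (three articles, word IDs 1, 2, 3 and 9) are invented for illustration and are not the fiddle's data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE article (idArticle INTEGER PRIMARY KEY, content TEXT);
CREATE TABLE words (idwords INTEGER, idArticle INTEGER);
CREATE TABLE words_learned (idwords INTEGER, userId INTEGER, "order" INTEGER);

-- Made-up sample data. User 4 learned words 1, 2, 3 in that order;
-- word 9 is a word user 4 has not learned.
INSERT INTO article VALUES (1, 'Level 1'), (2, 'Level 2'), (3, 'Level 3');
INSERT INTO words VALUES (3, 1), (2, 1), (2, 2), (1, 2), (1, 3), (9, 3);
INSERT INTO words_learned VALUES (1, 4, 1), (2, 4, 2), (3, 4, 3);
""")

# The first derived table, sorted the same way the final query sorts.
first_derived = conn.execute("""
    SELECT a.idArticle, a.content, MAX(wl."order") AS max_order
    FROM words_learned wl
    JOIN words w ON w.idwords = wl.idwords
    JOIN article a ON a.idArticle = w.idArticle
    WHERE wl.userId = ?
    GROUP BY a.idArticle
    ORDER BY max_order DESC
""", (4,)).fetchall()

for row in first_derived:
    print(row)
```

Each article appears once, and articles containing the most recently learned word sort to the top.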

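The second derived table counts the number of words a given user doesn't know for each article. This table is used to exclude any articles that contain 10 or more words the user doesn't know. An article with no unknown words has no row in this table, so the left join leaves its unknown_count as null, which is why the outer where clause also accepts null.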
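
The second derived table and the final filter can be checked the same way. The sketch below again uses Python's `sqlite3` module as a stand-in for MySQL, with invented data: article 1 contains only known words, article 2 has 2 unknown words, and article 3 has 10.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE article (idArticle INTEGER PRIMARY KEY, content TEXT);
CREATE TABLE words (idwords INTEGER, idArticle INTEGER);
CREATE TABLE words_learned (idwords INTEGER, userId INTEGER, "order" INTEGER);
""")
# Made-up sample data for user 4.
conn.executemany("INSERT INTO article VALUES (?, ?)", [(1, "A"), (2, "B"), (3, "C")])
conn.executemany("INSERT INTO words_learned VALUES (?, 4, ?)", [(1, 1), (2, 2), (3, 3)])
words = [(3, 1), (2, 1)]                               # article 1: known words only
words += [(2, 2), (100, 2), (101, 2)]                  # article 2: 2 unknown words
words += [(1, 3)] + [(200 + i, 3) for i in range(10)]  # article 3: 10 unknown words
conn.executemany("INSERT INTO words VALUES (?, ?)", words)

# Anti-join: keep only the words with no matching words_learned row for user 4.
UNKNOWN_COUNTS = """
    SELECT COUNT(*) AS unknown_count, w2.idArticle
    FROM words w2
    LEFT JOIN words_learned wl2
           ON wl2.idwords = w2.idwords AND wl2.userId = 4
    WHERE wl2.idwords IS NULL
    GROUP BY w2.idArticle
"""
unknown_counts = {article: n for n, article in conn.execute(UNKNOWN_COUNTS)}

FULL_QUERY = f"""
    SELECT a.idArticle, a.max_order, unknown_counts.unknown_count
    FROM (
        SELECT a.idArticle, a.content, MAX(wl."order") AS max_order
        FROM words_learned wl
        JOIN words w ON w.idwords = wl.idwords
        JOIN article a ON a.idArticle = w.idArticle
        WHERE wl.userId = 4
        GROUP BY a.idArticle
    ) a
    LEFT JOIN ({UNKNOWN_COUNTS}) unknown_counts
           ON unknown_counts.idArticle = a.idArticle
    WHERE unknown_count IS NULL OR unknown_count < 10
    ORDER BY max_order DESC
    LIMIT 100
"""
rows = conn.execute(FULL_QUERY).fetchall()
print(unknown_counts)
print(rows)
```

Article 1 is missing from `unknown_counts` and comes back from the full query with a null count, while article 3 is filtered out by the `< 10` condition.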