且构网

分享程序员开发的那些事...
且构网 - 分享程序员编程开发的那些事

在SQL Server 2008中查询大表

更新时间:2022-11-27 11:34:41

1.使用临时表

在您感兴趣的数据子集(行和列)上创建临时表.临时表应该比原始源表小得多,并且可以轻松地建立索引(如果需要).

1. Use temporary tables

Create temporary table on subset (rows and columns) of data you are interested in. Temporary table should be much smaller that original source table and can be indexed easily (if needed).

要创建临时表,您可以使用以下代码(未经测试):

To create temporary table you can use code (not tested) like:

-- copy records from last month to temporary table
INSERT INTO
   #my_temporary_table
SELECT
    *
FROM
    er101_acct_order_dtl WITH (NOLOCK)
WHERE 
    er101_upd_date_iso > DATEADD(month, -1, GETDATE())

-- run other queries on temporary table (which can be indexed)
SELECT TOP 100
    * 
FROM 
    #my_temporary_table 
ORDER BY 
    er101_upd_date_iso DESC

优点:

易于处理任何数据子集. 易于管理-这是临时的并且是桌子. 不会像视图那样影响整体系统性能. 临时表可以建立索引. 缺点:

Easy to do for any subset of data. Easy to manage -- it's temporary and it's table. Doesn't affect overall system performance like view. Temporary table can be indexed. Cons:

它是数据的快照-但这可能对即席查询就足够了.

It's snapshot of data -- but probably this is good enough for ad-hoc queries.

类似于上面,但是创建视图而不是临时表.

Similar to above, but create views instead of temporary tables.

您可以在您感兴趣的数据子集上创建视图或索引视图,并在视图上运行查询-该视图应仅包含比整个表小得多的有趣数据子集.

You can create views or indexed views on subset of data you are interested in and run queries on view -- which should contain only interesting subset of data much smaller then the whole table.

优点:

容易做. 它是最新的源数据. 缺点:

Easy to do. It's up to date with source data. Cons:

仅适用于已定义的数据子集. 对于具有高更新率的大型表可能效率不高. 不太容易管理. 会影响整体系统性能. 选择所有列 在大表上运行star查询(SELECT * FROM)并不是一件好事...

Possible only for defined subset of data. Could be inefficient for large tables with high rate of updates. Not so easy to manage. Can affect overall system performance. Selecting all columns Running star query (SELECT * FROM) on big table is not good thing...

如果您有大列(如长字符串),则需要花费大量时间从磁盘读取它们并通过网络传递.

If you have large columns (like long strings) it takes a lot of time to read them from disk and pass by network.

我会尝试将*替换为您真正需要的列名.

I would try to replace * with column names which you really need.

或者,如果您需要所有列,请尝试将查询重写为:

Or, if you need all columns try to rewrite query to something like:

;WITH recs AS (
    SELECT TOP 100 
        id as rec_id -- select primary key only
    FROM 
        er101_acct_order_dtl 
    ORDER BY 
        er101_upd_date_iso DESC
)
SELECT
    *
FROM
    er101_acct_order_dtl
WHERE 
    id = rec.rec_id
ORDER BY 
    er101_upd_date_iso DESC 

脏读

可以加快查询速度的最后一件事是允许使用表提示WITH(NOLOCK)进行脏读.

Dirty reads

Last thing which could speed up the query is allowing dirty reads with table hint WITH (NOLOCK).

代替提示,您可以将事务隔离级别设置为未提交:

Instead of hint you can set transaction isolation level to read uncommited:

SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED