C# ADO.Net DataTables 有索引吗?

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/1109550/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-08-06 08:27:04  来源:igfitidea点击:

Do ADO.Net DataTables have indexes?

c#.netsql-serverado.netdatatable

提问by George2

I am using VSTS 2008 + C# + .Net 3.5 + SQL Server 2008 + ADO.Net. If I load a table from a database by using a DataTable of ADO.Net, and in the database table, I defined a couple of indexes on the table. My question is, whether on the ADO.Net DataTable, there is related index (the same as the indexes I created on physical database table) to improve certain operation performance on DataTable?

我使用的是 VSTS 2008 + C# + .Net 3.5 + SQL Server 2008 + ADO.Net。如果我使用 ADO.Net 的 DataTable 从数据库中加载一个表,并且在数据库表中,我在表上定义了几个索引。我的问题是,ADO.Net DataTable上是否有相关的索引(和我在物理数据库表上创建的索引一样)来提高对DataTable的某些操作性能?

thanks in advance, George

提前致谢,乔治

采纳答案by John Saunders

George,

乔治,

The answer is no.

答案是不。

Actually, some sort of indexing may be used internally, but only as an implementation detail. For instance, if you create a foreign key constraint, maybe that's assisted by an index. But it doesn't matter to a developer.

实际上,可以在内部使用某种索引,但仅作为实现细节。例如,如果您创建了一个外键约束,可能会得到一个索引的帮助。但这对开发人员来说并不重要。

回答by Jason Short

John above is correct. DataTables are disconnected in memory structures. They do not map to the physical implementation of the database.

上面的约翰是对的。数据表在内存结构中是断开的。它们不映射到数据库的物理实现。

The indexes on disk are used to speed up lookups because you don't have all the rows. If you have to load every row and scan them it is slow, so an index makes sense. In a DataTable you already have all the rows, so a comparison is fast already.

磁盘上的索引用于加快查找速度,因为您没有所有行。如果您必须加载每一行并扫描它们,它会很慢,因此索引是有意义的。在 DataTable 中,您已经拥有所有行,因此比较已经很快了。

回答by STW

Others have made the point that a DataSet is not intended to serve as a database system--just a representation of data. If you are working under the impression that a DataSet is a database then you are mistaken and might need to reconsider your implementation.

其他人指出,DataSet 不打算用作数据库系统——只是数据的表示。如果您认为 DataSet 是一个数据库,那么您就错了,可能需要重新考虑您的实现。

If you need a client-side database, consider using SQL Compact or SQL Lite, both are free redistributable Database systems which can be used without requiring separate installations or services. If you need something more full-featured the SQL Express is the next step up.

如果您需要客户端数据库,请考虑使用 SQL Compact 或 SQL Lite,两者都是免费的可再分发数据库系统,无需单独安装或服务即可使用。如果您需要更全面的功能,SQL Express 是下一步。

To help clarify though, DataSets/Tables are used in .NET development to temporarily hold data as needed. Think of them as the results of a SELECT query against a database; they are roughly similar to CSV files or other forms of tabular data--you can pull data into them from a database, work with the data, and then push the changes back to a database--but they, on their own, are not databases.

不过,为了帮助澄清,在 .NET 开发中使用 DataSets/Tables 来根据需要临时保存数据。将它们视为对数据库进行 SELECT 查询的结果;它们大致类似于 CSV 文件或其他形式的表格数据——您可以从数据库中将数据拉入其中,处理数据,然后将更改推送回数据库——但它们本身并不是数据库。

If you have a large collection of items which you need to keep in memory for one reason or another then you might consider building a lightweight DTO (data transfer object, Google it, they're very simple) and loading them into a HashTable. HashTables won't give you any form of relational data, but are very efficient at look-ups.

如果出于某种原因需要将大量项目保存在内存中,那么您可能会考虑构建一个轻量级 DTO(数据传输对象,谷歌它,它们非常简单)并将它们加载到 HashTable 中。HashTables 不会为您提供任何形式的关系数据,但在查找时非常有效。

回答by PRMan

DataTables have a PrimaryKey field that can serve as an index (they are fast already anyway). This field is not copied from the Primary Keys of the database (although that might be nice).

DataTables 有一个 PrimaryKey 字段,可以用作索引(无论如何它们已经很快了)。该字段不是从数据库的主键复制的(尽管这可能很好)。

回答by Negative

Actually George's question is not so "bad" as some people insist it is. (I am more and more convinced that there's no such think as "a bad question".)

实际上,乔治的问题并不像某些人坚持的那样“糟糕”。(我越来越相信没有“坏问题”这样的想法。)

I have a rather big table which I load into the memory, in a DataTable object. A lot of processing is done on lines from this table, a lot of times, on various (and different) subsets which I can easily describe as "WHERE ..." of SELECT clauses. Now with this DataTable I can run Select() - a method of DataTable class - but it is quite inefficient.

我有一个相当大的表,我将它加载到内存中,位于 DataTable 对象中。许多处理是在这个表中的行上完成的,很多时候是在各种(和不同的)子集上,我可以很容易地将其描述为 SELECT 子句的“WHERE ...”。现在有了这个 DataTable 我可以运行 Select() - DataTable 类的一种方法 - 但它效率很低。

In the end, I decided to load the DataTable sorted by specific columns and implemented my own quick search, instead of using the Select() function. It proved to be much faster, but of course it works only on those sorted columns. The trouble would have been avoided, had a DataTable had indexes.

最后,我决定加载按特定列排序的 DataTable 并实现我自己的快速搜索,而不是使用 Select() 函数。事实证明它要快得多,但当然它只适用于那些已排序的列。如果 DataTable 有索引,麻烦就可以避免了。

回答by Spiff

You can create a primary key for the datatable. Filter operations get a big boost if you are searching in the primary key field. Check out this link: here

您可以为数据表创建主键。如果您在主键字段中搜索,过滤操作会得到很大的提升。看看这个链接:这里

回答by Noam

I had the same problem with many queries from a large datatable that are not according to the primary key.

我对来自大型数据表的许多查询都遇到了同样的问题,这些查询不是根据主键。

The solution I found was to create DataView for each index I wanted to use, and then use it's Find and FindRows methods to extract the data.

我找到的解决方案是为我想要使用的每个索引创建 DataView,然后使用它的 Find 和 FindRows 方法来提取数据。

DataView creates an internal index on the DataTable and behaves virtually as an index for this purpose.

DataView 在 DataTable 上创建一个内部索引,并为此目的实际上充当一个索引。

In my case I was able to reduce 10,000 queries from 40 Seconds to ONE!!!

就我而言,我能够将 10,000 次查询从 40 秒减少到 1 秒!!!

回答by Steve Townsend

My reading of the docs is that the correct way to achieve this (if needed) is to use AsDataViewto produce a DataView(or LinqDataView) that's bound to the underlying table. If your DataTableis invariant then the DataViewcan be static to avoid redundant re-indexing.

我对文档的阅读是,实现此目的的正确方法(如果需要)是使用AsDataView生成绑定到基础表的DataView(或LinqDataView)。如果你DataTable是不变的,那么它DataView可以是静态的,以避免冗余的重新索引。

I am currently investigating Linq to DataSet, and this q was helpful to me, so thanks.

我目前正在调查Linq to DataSet,这个问题对我有帮助,所以谢谢。

回答by Fixee

DataTables are indexed if you (the coder) specify one or more DataColumns as the Primary Key. Interally ADO.NET uses a Red-Black tree to form this index giving log-time lookups. This Primary Key is not set automatically based on any underlying keying from the data provider.

如果您(编码人员)指定一个或多个 DataColumns 作为主键,则 DataTables 被索引。ADO.NET 内部使用红黑树来形成这个索引,提供日志时间查找。此主键不会根据来自数据提供者的任何底层键控自动设置。

回答by Steve Hibbert

No, but possibly yes.

不,但可能是。

You can set up your own indices on a DataTable, using a DataView. As you change the table, the DataView will be rebuilt, so the index should always be up to date.

您可以使用 DataView 在 DataTable 上设置自己的索引。当您更改表时,将重建 DataView,因此索引应始终是最新的。

I did some bench tests for my own app. I use a DataTable to approximate a Boost MultiIndexContainer. To create an index on a column call "Author", I initialise the DataTable, and then the DataView...

我为我自己的应用程序做了一些基准测试。我使用 DataTable 来近似 Boost MultiIndexContainer。要在名为“Author”的列上创建索引,我初始化了 DataTable,然后初始化了 DataView...

_dvChangesByAuthor = 
    new DataView(
        _dtChanges, 
        string.Empty, 
        "Author ASC", 
        DataViewRowState.CurrentRows);

To then pull data by Author from the table, you use the view's FindRows function...

然后要按作者从表中提取数据,请使用视图的 FindRows 函数...

            dataRowViews = _dvChangesByAuthor.FindRows(author);
            List<DataRow> returnRows = new List<DataRow>();
            foreach (DataRowView drv in dataRowViews)
            {
                returnRows.Add(drv.Row);
            }

I made a random large DataTable, and ran queries using DataTable.Select(), Linq-To-DataSet (with forced execution by exporting to list) and the above DataView method. The DataView method won easily. Linq took 5000 ticks, Select took over 26000 ticks, DataView took 192 ticks...

我制作了一个随机的大型 DataTable,并使用 DataTable.Select()、Linq-To-DataSet(通过导出到列表强制执行)和上述 DataView 方法运行查询。DataView 方法轻松获胜。Linq 用了 5000 个滴答,Select 用了 26000 个滴答,DataView 用了 192 个滴答……

LOC=20141121-14:46:32.863,UTC=20141121-14:46:32.863,DELTA=72718,THR=9,DEBUG,LOG=Program,volumeTest() - Running queries for author >TFYN_AUTHOR_047<
LOC=20141121-14:46:32.863,UTC=20141121-14:46:32.863,DELTA=72718,THR=9,DEBUG,LOG=RightsChangeTracker,GetChangesByAuthorUsingLinqToDataset() - Query elapsed time: 2 ms, 4934 ticks; Rows=65 
LOC=20141121-14:46:32.879,UTC=20141121-14:46:32.879,DELTA=72733,THR=9,DEBUG,LOG=RightsChangeTracker,GetChangesByAuthorUsingSelect() - Query elapsed time: 11 ms, 26575 ticks; Rows=65 
LOC=20141121-14:46:32.879,UTC=20141121-14:46:32.879,DELTA=72733,THR=9,DEBUG,LOG=RightsChangeTracker,GetChangesByAuthorUsingDataview() - Query elapsed time: 0 ms, 192 ticks; Rows=65

So, if you want indices on a DataTable, I would suggest DataView, if you can deal with the fact that the index is re-built when the data changes.

所以,如果你想在 DataTable 上建立索引,我会建议 DataView,如果你能处理在数据改变时重建索引的事实。