且构网


What are collection semantics (in .NET)?

Updated: 2023-02-14 19:13:06

There are many collection types in .NET, and they all share some common behavior. For instance:

  • You can enumerate them using foreach
  • They have a Count property
  • You can add items using the Add method
  • etc.

This behavior is expected from a collection type, and you guessed it: it's all in the ICollection<T> interface. Let's take a look at the interface hierarchy:

  • IEnumerable<T> allows your class to be enumerated with foreach
  • ICollection<T> is an IEnumerable<T> that represents a collection:
    • it lets you retrieve the item Count
    • you can possibly Add/Remove/Clear items from the collection
    • a collection can be read-only, in which case IsReadOnly should return true
    • there are a couple of other helper methods too: Contains/CopyTo
  • IList<T> is an ICollection<T> that allows items to be retrieved by index:
    • it adds an indexer
    • some index-related functions: Insert/RemoveAt
    • IndexOf

Which interface you should implement is a matter of semantics:

IEnumerable<T> is just an enumerable sequence. Consuming code should enumerate it only once, because you never know how it will behave when enumerated multiple times. Tools like ReSharper will even emit a warning if you enumerate an IEnumerable<T> more than once.
Sure, most of the time you can safely enumerate it multiple times, but there are times when you shouldn't. For instance, each enumeration could execute a SQL query (think LINQ to SQL).
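To make the cost of repeated enumeration concrete, here is a minimal sketch. ExpensiveQuery and its QueryExecutions counter are hypothetical stand-ins for a real data source such as a database query; it is written as an iterator method (the yield keyword is covered further down).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MultipleEnumerationDemo
{
    // Hypothetical counter: how many times the "query" actually ran.
    public static int QueryExecutions;

    // Hypothetical stand-in for an expensive source (e.g. a SQL query).
    // The body only runs when someone starts enumerating the result.
    public static IEnumerable<int> ExpensiveQuery()
    {
        QueryExecutions++;
        yield return 1;
        yield return 2;
        yield return 3;
    }

    // Enumerates the lazy sequence twice, so the "query" runs twice.
    public static int SumTwiceLazily()
    {
        IEnumerable<int> sequence = ExpensiveQuery();
        return sequence.Sum() + sequence.Sum();
    }

    // Materializes the sequence once, then works on the in-memory list.
    public static int SumTwiceMaterialized()
    {
        List<int> items = ExpensiveQuery().ToList();
        return items.Sum() + items.Sum();
    }
}
```

Both methods return the same total, but after SumTwiceLazily the counter has gone up by two, while SumTwiceMaterialized only increments it once. Copying into a List<T> (or an array) is the usual way out when you need several passes.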

You implement IEnumerable<T> by defining one function, GetEnumerator, which returns an IEnumerator<T> (plus its non-generic counterpart inherited from IEnumerable). An enumerator is an object that acts as a kind of pointer to the current element in your sequence. It can return this Current value, and it can move to the next element with MoveNext. It's also disposable (and foreach disposes it at the end of the enumeration).

Let's decompose a foreach loop:

      IEnumerable<T> sequence = ... // Whatever
      foreach (T item in sequence)
          DoSomething(item);
      

This is equivalent to:

      IEnumerator<T> enumerator = null;
      try
      {
          enumerator = sequence.GetEnumerator();
          while (enumerator.MoveNext())
          {
              T item = enumerator.Current;
              DoSomething(item);
          }
      }
      finally
      {
          if (enumerator != null)
              enumerator.Dispose();
      }
      

For the record, implementing IEnumerable is not strictly required to make a class usable with foreach. Duck typing is sufficient here: the compiler only needs an accessible GetEnumerator method whose return type has a MoveNext method and a Current property. But I'm digressing way too much.
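As a quick illustration of that duck typing, the sketch below uses two hypothetical classes, Countdown and CountdownEnumerator, neither of which implements IEnumerable or IEnumerator. The foreach loop still compiles because the required members have the right shape.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical type: note that it implements no interface at all.
public class Countdown
{
    private readonly int _start;

    public Countdown(int start) { _start = start; }

    // foreach only looks for a method with this name and shape.
    public CountdownEnumerator GetEnumerator() => new CountdownEnumerator(_start);
}

// Hypothetical enumerator: just MoveNext and Current, no IEnumerator.
public class CountdownEnumerator
{
    private int _current;

    public CountdownEnumerator(int start) { _current = start + 1; }

    public int Current => _current;

    public bool MoveNext()
    {
        _current--;
        return _current > 0;
    }
}

public static class DuckTypingDemo
{
    // Collects the values produced by foreach over a Countdown.
    public static List<int> Collect(int start)
    {
        var result = new List<int>();
        foreach (int value in new Countdown(start))
            result.Add(value);
        return result;
    }
}
```

DuckTypingDemo.Collect(3) yields 3, 2, 1. In practice you would still implement IEnumerable<T>, since LINQ and most APIs ask for the interface rather than the pattern.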

And of course you can implement the pattern easily with the yield keyword:

      public static IEnumerable<int> GetAnswer()
      {
          yield return 42;
      }
      

This makes the compiler generate a private class that implements IEnumerable<int> for you, so you don't have to.

ICollection<T> represents a collection, which is safely enumerable multiple times. But you don't really know what kind of collection it is. It could be a set, a list, a dictionary, whatever.

These are the collection semantics.

Some examples:

  • T[] - it implements ICollection<T> even though you can't Add/Remove
  • List<T>
  • HashSet<T> - a good example of a collection that is not a list
  • Dictionary<TKey, TValue> - yes, that's an ICollection<KeyValuePair<TKey, TValue>>
  • LinkedList<T>
  • ObservableCollection<T>
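A short sketch of what these semantics buy you: code written against ICollection<T> works with any of the types above and can consult IsReadOnly before mutating. The helper names Describe and TryAdd are made up for this illustration.

```csharp
using System;
using System.Collections.Generic;

public static class CollectionSemanticsDemo
{
    // Works for arrays, lists, sets, dictionaries... anything with collection semantics.
    public static string Describe<T>(ICollection<T> collection)
    {
        return $"{collection.Count} item(s), read-only: {collection.IsReadOnly}";
    }

    // Adds the item only if the collection reports that it accepts changes.
    public static bool TryAdd<T>(ICollection<T> collection, T item)
    {
        if (collection.IsReadOnly)
            return false;   // e.g. an array seen through ICollection<T>

        collection.Add(item);
        return true;
    }
}
```

Passing an int[] to TryAdd returns false because an array reports IsReadOnly as true through ICollection<T> (calling Add on it directly would throw NotSupportedException), whereas a HashSet<int> or a Dictionary<string, int> accepts the new item.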

IList<T> lets you know that the collection is the kind that lets you access elements by index easily (i.e. in O(1) time).

This is not the case for your circular linked list: not only would it need O(n) time, but there's no meaningful index in the first place.

Some examples:

  • T[]
  • List<T>
  • ObservableCollection<T>

Notice that HashSet<T> and Dictionary<TKey, TValue> are no longer in the list, for instance. These are not lists. LinkedList<T> is semantically a list, but it doesn't offer access by index in O(1) time (it requires O(n)), so it doesn't implement IList<T>.
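The difference shows up in method signatures. In the sketch below (the helper names Middle and MiddleOfLinkedList are invented for illustration), asking for IList<T> documents that the method relies on cheap indexed access, while the LinkedList<T> version has to fall back on walking the nodes.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ListSemanticsDemo
{
    // Accepts T[], List<T>, ObservableCollection<T>... and uses the indexer directly.
    public static T Middle<T>(IList<T> list)
    {
        if (list.Count == 0)
            throw new ArgumentException("The list is empty.", nameof(list));

        return list[list.Count / 2];
    }

    // LinkedList<T> has no indexer, so LINQ's ElementAt enumerates
    // node by node until it reaches the requested position: O(n).
    public static T MiddleOfLinkedList<T>(LinkedList<T> list)
    {
        if (list.Count == 0)
            throw new ArgumentException("The list is empty.", nameof(list));

        return list.ElementAt(list.Count / 2);
    }
}
```

Trying to pass a LinkedList<T> or a HashSet<T> to Middle is a compile-time error, which is exactly the point of choosing the interface by its semantics.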

I should mention the read-only equivalents that made it into .NET 4.5: IReadOnlyCollection<out T> and IReadOnlyList<out T>. These are nice for the covariance they provide.
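Here is a small sketch of what that covariance allows. TotalLength and Demo are hypothetical helpers; the interesting line is the assignment of a List<string> to an IReadOnlyList<object>.

```csharp
using System;
using System.Collections.Generic;

public static class CovarianceDemo
{
    // Accepts a read-only list of any reference type, thanks to the "out T".
    public static int TotalLength(IReadOnlyList<object> items)
    {
        int total = 0;
        for (int i = 0; i < items.Count; i++)
        {
            if (items[i] is string text)
                total += text.Length;
        }
        return total;
    }

    public static int Demo()
    {
        List<string> names = new List<string> { "Ada", "Linus" };

        // Allowed: IReadOnlyList<string> converts to IReadOnlyList<object>.
        IReadOnlyList<object> readOnlyView = names;

        // IList<object> invalid = names;   // does not compile: IList<T> is invariant

        return TotalLength(readOnlyView);
    }
}
```

The conversion is safe because a read-only interface only hands elements out and never accepts an object that might not be a string. IList<T> cannot offer the same guarantee since it exposes Add and an indexer setter.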