且构网

A better, simpler example of a "semantic conflict"?

Updated: 2022-10-16 23:04:18



    I like to distinguish three different types of conflict from a version control system (VCS):

    • textual
    • syntactic
    • semantic

    A textual conflict is one that is detected by the merge or update process. This is flagged by the system. A commit of the result is not permitted by the VCS until the conflict is resolved.

    A syntactic conflict is not flagged by the VCS, but the result will not compile. Therefore this should also be picked up by even a slightly careful programmer. (A simple example might be a variable rename by Left and some added lines using that variable by Right. The merge will probably have an unresolved symbol. Alternatively, this might introduce a semantic conflict by variable hiding.)
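The variable-hiding case mentioned in the parenthesis can be sketched as follows. This is a minimal, hypothetical illustration: the names `scale`, `factor`, `areaIntended` and `areaMerged` are assumptions, not taken from any real code. Left renames a local variable, Right adds a line that still uses the old name, and the merged code compiles only because a file-scope variable with the same name happens to exist.

```cpp
#include <cassert>

// Hypothetical file-scope variable that shares the local's old name.
int scale = 1;

// What Right had in mind: the local `scale` hides the file-scope one.
int areaIntended(int w, int h)
{
    int scale = 2;
    int area = w * h * scale;
    area += scale; // R: added line, uses the local (2)
    return area;
}

// Merged result: Left renamed the local, Right's line kept the old name.
int areaMerged(int w, int h)
{
    int factor = 2;            // L: renamed from `scale`
    int area = w * h * factor; // L
    area += scale;             // R: now binds to the file-scope `scale` (1)
    return area;
}
```

With these assumed values, `areaIntended(3, 4)` yields 26 while `areaMerged(3, 4)` yields 25, and neither the VCS nor the compiler reports an error.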

    Finally, a semantic conflict is not flagged by the VCS, the result compiles, but the code may have problems running. In mild cases, incorrect results are produced. In severe cases, a crash could be introduced. Even these should be detected before commit by a very careful programmer, through either code review or unit testing.

    My example of a semantic conflict uses SVN (Subversion) and C++, but those choices are not really relevant to the essence of the question.

    The base code is:

    int i = 0;
    int odds = 0;
    while (i < 10)
    {
        if ((i & 1) != 0)
        {
            odds *= 10;
            odds += i;
        }
        // next
        ++ i;
    }
    assert (odds == 13579);
    

    The Left (L) and Right (R) changes are as follows.

    Left's 'optimisation' (changing the values the loop variable takes):

    int i = 1; // L
    int odds = 0;
    while (i < 10)
    {
        if ((i & 1) != 0)
        {
            odds *= 10;
            odds += i;
        }
        // next
        i += 2; // L
    }
    assert (odds == 13579);
    

    Right's 'optimisation' (changing how the loop variable is used):

    int i = 0;
    int odds = 0;
    while (i < 5) // R
    {
        odds *= 10;
        odds += 2 * i + 1; // R
        // next
        ++ i;
    }
    assert (odds == 13579);
    

    This is the result of a merge or update, and is not detected by SVN (which is correct behaviour for the VCS), so it is not a textual conflict. Note that it compiles, so it is not a syntactic conflict.

    int i = 1; // L
    int odds = 0;
    while (i < 5) // R
    {
        odds *= 10;
        odds += 2 * i + 1; // R
        // next
        i += 2; // L
    }
    assert (odds == 13579);
    

    The assert fails because odds is 37.
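To make the four variants easy to compare side by side, here is a sketch that wraps each of them in a function returning `odds`. The function names (`oddsBase`, `oddsLeft`, `oddsRight`, `oddsMerged`) are assumptions added for illustration; the loop bodies are the ones shown above.

```cpp
#include <cassert>

// Base: visit 0..9 and keep the odd values.
int oddsBase()
{
    int i = 0;
    int odds = 0;
    while (i < 10)
    {
        if ((i & 1) != 0)
        {
            odds *= 10;
            odds += i;
        }
        ++i;
    }
    return odds;
}

// Left: visit only 1, 3, 5, 7, 9.
int oddsLeft()
{
    int i = 1; // L
    int odds = 0;
    while (i < 10)
    {
        if ((i & 1) != 0)
        {
            odds *= 10;
            odds += i;
        }
        i += 2; // L
    }
    return odds;
}

// Right: visit 0..4 and map each index to the odd value 2 * i + 1.
int oddsRight()
{
    int i = 0;
    int odds = 0;
    while (i < 5) // R
    {
        odds *= 10;
        odds += 2 * i + 1; // R
        ++i;
    }
    return odds;
}

// Merge of Left and Right: only i = 1 and i = 3 are visited.
int oddsMerged()
{
    int i = 1; // L
    int odds = 0;
    while (i < 5) // R
    {
        odds *= 10;
        odds += 2 * i + 1; // R
        i += 2; // L
    }
    return odds;
}
```

The first three functions return 13579. The merged one appends 3 (for i = 1) and then 7 (for i = 3) before the loop ends, which is where 37 comes from.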

    So my question is as follows. Is there a simpler example than this? Is there a simple example where the compiled executable has a new crash?

    As a secondary question, are there cases of this that you have encountered in real code? Again, simple examples are especially welcome.

    It is not easy to come up with simple, relevant examples, and this comment sums up best why:

    If the changes are close by, then trivial resolutions are more likely to be correct (because those that are incorrect are more likely to touch the same parts of the code and thus result in non-trivial conflicts), and in those few cases where they aren’t, the problem will manifest itself relatively quickly and probably in an obvious way.

    [Which is basically what your example illustrates]

    But detecting semantic conflicts introduced by merges between changes in widely separated areas of the code is likely to require holding more of the program in your head than most programmers can – or in projects the size of the kernel, than any programmer can.
    So even if you did review those 3-way diffs manually, it would be a comparatively useless exercise: the effort would be far disproportionate with the gain in confidence.

    In fact, I would argue that merging is a red herring:
    this sort of semantic clash between disparate but interdependent parts of the code is inevitable the moment they can evolve separately.
    How this concurrent development process is organized – DVCS; CVCS; tarballs and patches; everyone edits the same files on a network share – is of no consequence at all to that fact.
    Merging doesn’t cause semantic clashes, programming causes semantic clashes.

    In other words, the real cases of semantic conflict I have encountered in real code after a merge were not simple, but rather quite complex.


    That being said, the simplest example, as illustrated by Martin Fowler in his article Feature Branch, is a method rename:

    The problem I worry more about is a semantic conflict.
    A simple example of this is that if Professor Plum changes the name of a method that Reverend Green's code calls. Refactoring tools allow you to rename a method safely, but only on your code base.
    So if G1-6 contain new code that calls foo, Professor Plum can't tell in his code base as he doesn't have it. You only find out on the big merge.

    A function rename is a relatively obvious case of a semantic conflict.
    In practice they can be much more subtle.

    Tests are the key to discovering them, but the more code there is to merge the more likely you'll have conflicts and the harder it is to fix them.
    It's the risk of conflicts, particularly semantic conflicts, that make big merges scary.
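In a compiled language, a plain rename with a stale caller usually shows up as a build error after the merge. The sketch below is a hypothetical C++ variant, not taken from the article, in which the merged code still compiles: Left renames a virtual method, while Right adds a subclass that overrides it under the old name. All names (`Report`, `HtmlReport`, `fmt`, `format`, `renderHtml`) are assumptions.

```cpp
#include <cassert>
#include <string>

// Merged result of two branches.
struct Report
{
    virtual ~Report() = default;

    // L: renamed from `fmt` to `format`; every caller Left could see was updated.
    virtual std::string format() const { return "plain"; }

    std::string render() const { return "[" + format() + "]"; }
};

// R: written on the other branch against the old name, without `override`.
struct HtmlReport : Report
{
    // After the merge this declares a brand-new virtual function
    // instead of overriding Report's method.
    virtual std::string fmt() const { return "html"; }
};

std::string renderHtml()
{
    HtmlReport report;
    return report.render(); // Right expected "[html]"
}
```

Here `renderHtml()` returns "[plain]", and nothing flags the change. Had Right marked `fmt` with `override`, the compiler would have rejected the merged code, turning this into the easier-to-spot kind of conflict.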


    As Ole Lynge mentions in his answer (upvoted), Martin Fowler did write today (time of this edit) a post about "semantic conflict", including the following illustration:

    Again, this is based on function renaming, even though subtler cases based on internal function refactoring are mentioned:

    The simplest example is that of renaming a function.
    Say I think that the method clcBl would be easier to work with if it were called calculateBill.

    So the first point here is that however powerful your tooling is, it will only protect you from textual conflicts.

    There are, however, a couple of strategies that can significantly help us deal with them.

    • The first of these is SelfTestingCode. Tests are effectively probing our code to see if their view of the code's semantics are consistent with what the code actually does.
    • The other technique that helps is to merge more often.
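As a rough illustration of how a test can probe semantics, here is a hypothetical sketch, not taken from the article. It assumes that Left changed `calculateBill` so that it no longer adds tax, while Right added `invoiceTotal` when `calculateBill` still included tax. The helper names, the 20% rate and the amounts are all assumptions.

```cpp
#include <cassert>

// L: after Left's refactoring, calculateBill returns the net amount only;
//    callers are expected to add tax themselves.
int calculateBill(int netCents)
{
    return netCents;
}

// Assumed flat 20% tax, for illustration only.
int addTax(int cents)
{
    return cents + cents / 5;
}

// R: written when calculateBill still included tax, so no addTax call here.
int invoiceTotal(int netCents)
{
    return calculateBill(netCents);
}

// A test stating what Right believes about the semantics of invoiceTotal:
// it returns true only if the total includes tax.
bool invoiceTotalIncludesTax()
{
    return invoiceTotal(1000) == 1200;
}
```

The merge is textually clean and compiles, but `invoiceTotalIncludesTax()` returns false on the merged code, which is the kind of signal a test suite run after every merge can give.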

    Often people try to justify DVCSs based on how they make feature branching easy. But that misses the issues of semantic conflicts.
    If your features are built quickly, within a couple of days, then you'll run into less semantic conflicts (and if less than a day, then it's in effect the same as CI). However we don't see such short feature branches very often.

    I think a middle ground needs to be found between short-lived branches and feature branches.
    And merging often is key if you have a group of developers on the same feature branch.