CROSS APPLY produces outer joinSQL counting distinct over partitionHow to use merge hints to isolate complex queries in SQL ServerSSIS Merge Join Trouble: 3 tables into 1Outer Apply vs Left Join PerformanceCROSS APPLY on Scalar functionFull outer join problemsWhy are these two queries having such different executions?Performance gap between WHERE IN (1,2,3,4) vs IN (select * from STRING_SPLIT('1,2,3,4',','))Using CROSS APPLY with GROUP BY and TOP 1 with duplicate dataPerformance improvement Outer ApplyGroup by sum based on under group in SQL Server

Is a single radon-daughter atom in air a solid?

Should I prioritize my 401k over my student loans?

How do I set an alias to a terminal line?

When to remove insignificant variables?

What was the Shuttle Carrier Aircraft escape tunnel?

Why use cross notes in sheet music for hip hop tracks?

How to Fill the plot like this?

How can I politely work my way around not liking coffee or beer when it comes to professional networking?

Unusual mail headers, evidence of an attempted attack. Have I been pwned?

Why does the Saturn V have standalone inter-stage rings?

Employer wants to use my work email account after I quit

Why don't countries like Japan just print more money?

Long term BTC investing

Can any NP-Complete Problem be solved using at most polynomial space (but while using exponential time?)

Relationship between woodwinds and brass in a marching band?

How does a pilot select the correct ILS when the airport has parallel runways?

"How can you guarantee that you won't change/quit job after just couple of months?" How to respond?

Can Ogre clerics use Purify Food and Drink on humanoid characters?

Is it damaging to turn off a small fridge for two days every week?

If I wouldn't want to read the story, is writing it still a good idea?

Hot coffee brewing solutions for deep woods camping

Loss of power when I remove item from the outlet

What did River say when she woke from her proto-comatose state?

What does "play with your toy’s toys" mean?



CROSS APPLY produces outer join


SQL counting distinct over partitionHow to use merge hints to isolate complex queries in SQL ServerSSIS Merge Join Trouble: 3 tables into 1Outer Apply vs Left Join PerformanceCROSS APPLY on Scalar functionFull outer join problemsWhy are these two queries having such different executions?Performance gap between WHERE IN (1,2,3,4) vs IN (select * from STRING_SPLIT('1,2,3,4',','))Using CROSS APPLY with GROUP BY and TOP 1 with duplicate dataPerformance improvement Outer ApplyGroup by sum based on under group in SQL Server






.everyoneloves__top-leaderboard:empty,.everyoneloves__mid-leaderboard:empty,.everyoneloves__bot-mid-leaderboard:empty margin-bottom:0;








16















In answer to SQL counting distinct over partition Erik Darling posted this code to work around for the lack of COUNT(DISTINCT) OVER ():



SELECT *
FROM #MyTable AS mt
CROSS APPLY ( SELECT COUNT(DISTINCT mt2.Col_B) AS dc
FROM #MyTable AS mt2
WHERE mt2.Col_A = mt.Col_A
-- GROUP BY mt2.Col_A
) AS ca;


The query uses CROSS APPLY (not OUTER APPLY) so why is there an outer join in the execution plan instead of an inner join?



enter image description here



Also why does uncommenting the group by clause result in an inner join?



enter image description here



I don't think the data is important but copying from that given by kevinwhat on the other question:



create table #MyTable (
Col_A varchar(5),
Col_B int
)

insert into #MyTable values ('A',1)
insert into #MyTable values ('A',1)
insert into #MyTable values ('A',2)
insert into #MyTable values ('A',2)
insert into #MyTable values ('A',2)
insert into #MyTable values ('A',3)

insert into #MyTable values ('B',4)
insert into #MyTable values ('B',4)
insert into #MyTable values ('B',5)









share|improve this question






























    16















    In answer to SQL counting distinct over partition Erik Darling posted this code to work around for the lack of COUNT(DISTINCT) OVER ():



    SELECT *
    FROM #MyTable AS mt
    CROSS APPLY ( SELECT COUNT(DISTINCT mt2.Col_B) AS dc
    FROM #MyTable AS mt2
    WHERE mt2.Col_A = mt.Col_A
    -- GROUP BY mt2.Col_A
    ) AS ca;


    The query uses CROSS APPLY (not OUTER APPLY) so why is there an outer join in the execution plan instead of an inner join?



    enter image description here



    Also why does uncommenting the group by clause result in an inner join?



    enter image description here



    I don't think the data is important but copying from that given by kevinwhat on the other question:



    create table #MyTable (
    Col_A varchar(5),
    Col_B int
    )

    insert into #MyTable values ('A',1)
    insert into #MyTable values ('A',1)
    insert into #MyTable values ('A',2)
    insert into #MyTable values ('A',2)
    insert into #MyTable values ('A',2)
    insert into #MyTable values ('A',3)

    insert into #MyTable values ('B',4)
    insert into #MyTable values ('B',4)
    insert into #MyTable values ('B',5)









    share|improve this question


























      16












      16








      16


      4






      In answer to SQL counting distinct over partition Erik Darling posted this code to work around for the lack of COUNT(DISTINCT) OVER ():



      SELECT *
      FROM #MyTable AS mt
      CROSS APPLY ( SELECT COUNT(DISTINCT mt2.Col_B) AS dc
      FROM #MyTable AS mt2
      WHERE mt2.Col_A = mt.Col_A
      -- GROUP BY mt2.Col_A
      ) AS ca;


      The query uses CROSS APPLY (not OUTER APPLY) so why is there an outer join in the execution plan instead of an inner join?



      enter image description here



      Also why does uncommenting the group by clause result in an inner join?



      enter image description here



      I don't think the data is important but copying from that given by kevinwhat on the other question:



      create table #MyTable (
      Col_A varchar(5),
      Col_B int
      )

      insert into #MyTable values ('A',1)
      insert into #MyTable values ('A',1)
      insert into #MyTable values ('A',2)
      insert into #MyTable values ('A',2)
      insert into #MyTable values ('A',2)
      insert into #MyTable values ('A',3)

      insert into #MyTable values ('B',4)
      insert into #MyTable values ('B',4)
      insert into #MyTable values ('B',5)









      share|improve this question
















      In answer to SQL counting distinct over partition Erik Darling posted this code to work around for the lack of COUNT(DISTINCT) OVER ():



      SELECT *
      FROM #MyTable AS mt
      CROSS APPLY ( SELECT COUNT(DISTINCT mt2.Col_B) AS dc
      FROM #MyTable AS mt2
      WHERE mt2.Col_A = mt.Col_A
      -- GROUP BY mt2.Col_A
      ) AS ca;


      The query uses CROSS APPLY (not OUTER APPLY) so why is there an outer join in the execution plan instead of an inner join?



      enter image description here



      Also why does uncommenting the group by clause result in an inner join?



      enter image description here



      I don't think the data is important but copying from that given by kevinwhat on the other question:



      create table #MyTable (
      Col_A varchar(5),
      Col_B int
      )

      insert into #MyTable values ('A',1)
      insert into #MyTable values ('A',1)
      insert into #MyTable values ('A',2)
      insert into #MyTable values ('A',2)
      insert into #MyTable values ('A',2)
      insert into #MyTable values ('A',3)

      insert into #MyTable values ('B',4)
      insert into #MyTable values ('B',4)
      insert into #MyTable values ('B',5)






      sql-server execution-plan cross-apply






      share|improve this question















      share|improve this question













      share|improve this question




      share|improve this question








      edited Jun 6 at 10:28







      Paul White

















      asked Jun 5 at 14:08









      Paul WhitePaul White

      56.2k14295469




      56.2k14295469




















          2 Answers
          2






          active

          oldest

          votes


















          19














          Summary



          SQL Server uses the correct join (inner or outer) and adds projections where necessary to honour all the semantics of the original query when performing internal translations between apply and join.



          The differences in the plans can all be explained by the different semantics of aggregates with and without a group by clause in SQL Server.




          Details



          Join vs Apply



          We will need to be able to distinguish between an apply and a join:




          • Apply



            The inner (lower) input of the apply is run for each row of the outer (upper) input, with one or more inner side parameter values provided by the current outer row. The overall result of the apply is the combination (union all) of all the rows produced by the parameterized inner side executions. The presence of parameters means apply is sometimes referred to as a correlated join.



            An apply is always implemented in execution plans by the Nested Loops operator. The operator will have an Outer References property rather than join predicates. The outer references are the parameters passed from the outer side to the inner side on each iteration of the loop.




          • Join



            A join evaluates its join predicate at the join operator. The join may generally be implemented by Hash Match, Merge, or Nested Loops operators in SQL Server.



            When Nested Loops is chosen, it can be distinguished from an apply by the lack of Outer References (and usually the presence of a join predicate). The inner input of a join never references values from the outer input - the inner side is still executed once for each outer row, but inner side executions do not depend on any values from the current outer row.



          For more details see my post Apply versus Nested Loops Join.




          ...why is there an outer join in the execution plan instead of an inner join?




          The outer join arises when the optimizer transforms an apply to a join (using a rule called ApplyHandler) to see if it can find a cheaper join-based plan. The join is required to be an outer join for correctness when the apply contains a scalar aggregate. An inner join would not be guaranteed to produce the same results as the original apply as we will see.



          Scalar and Vector Aggregates



          • An aggregate without a corresponding GROUP BY clause is a scalar aggregate.

          • An aggregate with a corresponding GROUP BY clause is a vector aggregate.

          In SQL Server, a scalar aggregate will always produce a row, even if it is given no rows to aggregate. For example, the scalar COUNT aggregate of no rows is zero. A vector COUNT aggregate of no rows is the empty set (no rows at all).



          The following toy queries illustrate the difference. You can also read more about scalar and vector aggregates in my article Fun with Scalar and Vector Aggregates.



          -- Produces a single zero value
          SELECT COUNT_BIG(*) FROM #MyTable AS MT WHERE 0 = 1;

          -- Produces no rows
          SELECT COUNT_BIG(*) FROM #MyTable AS MT WHERE 0 = 1 GROUP BY ();


          db<>fiddle demo



          Transforming apply to join



          I mentioned before that the join is required to be an outer join for correctness when the original apply contains a scalar aggregate. To show why this is the case in detail, I will use a simplified example of the question query:



          DECLARE @A table (A integer NULL, B integer NULL);
          DECLARE @B table (A integer NULL, B integer NULL);

          INSERT @A (A, B) VALUES (1, 1);
          INSERT @B (A, B) VALUES (2, 2);

          SELECT * FROM @A AS A
          CROSS APPLY (SELECT c = COUNT_BIG(*) FROM @B AS B WHERE B.A = A.A) AS CA;


          The correct result for column c is zero, because the COUNT_BIG is a scalar aggregate. When translating this apply query to join form, SQL Server generates an internal alternative that would look similar to the following if it were expressed in T-SQL:



          SELECT A.*, c = COALESCE(J1.c, 0)
          FROM @A AS A
          LEFT JOIN
          (
          SELECT B.A, c = COUNT_BIG(*)
          FROM @B AS B
          GROUP BY B.A
          ) AS J1
          ON J1.A = A.A;


          To rewrite the apply as an uncorrelated join, we have to introduce a GROUP BY in the derived table (otherwise there could be no A column to join on). The join has to be an outer join so each row from table @A continues to produce a row in the output. The left join will produce a NULL for column c when the join predicate does not evaluate to true. That NULL needs to be translated to zero by COALESCE to complete a correct transformation from apply.



          The demo below shows how both outer join and COALESCE are required to produce the same results using join as the original apply query:



          db<>fiddle demo



          With the GROUP BY




          ...why does uncommenting the group by clause result in an inner join?




          Continuing the simplified example, but adding a GROUP BY:



          DECLARE @A table (A integer NULL, B integer NULL);
          DECLARE @B table (A integer NULL, B integer NULL);

          INSERT @A (A, B) VALUES (1, 1);
          INSERT @B (A, B) VALUES (2, 2);

          -- Original
          SELECT * FROM @A AS A
          CROSS APPLY
          (SELECT c = COUNT_BIG(*) FROM @B AS B WHERE B.A = A.A GROUP BY B.A) AS CA;



          The COUNT_BIG is now a vector aggregate, so the correct result for an empty input set is no longer zero, it is no row at all. In other words, running the statements above produces no output.



          These semantics are much easier to honour when translating from apply to join, since CROSS APPLY naturally rejects any outer row that generates no inner side rows. We can therefore safely use an inner join now, with no extra expression projection:



          -- Rewrite
          SELECT A.*, J1.c
          FROM @A AS A
          JOIN
          (
          SELECT B.A, c = COUNT_BIG(*)
          FROM @B AS B
          GROUP BY B.A
          ) AS J1
          ON J1.A = A.A;


          The demo below shows that the inner join rewrite produces the same results as the original apply with vector aggregate:



          db<>fiddle demo



          The optimizer happens to choose a merge inner join with the small table because it finds a cheap join plan quickly (good enough plan found). The cost based optimizer may go on to rewrite the join back to an apply - perhaps finding a cheaper apply plan, as it will here if a loop join or forceseek hint is used - but it is not worth the effort in this case.



          Notes



          The simplified examples use different tables with different contents to show the semantic differences more clearly.



          One could argue that the optimizer ought to be able to reason about a self-join not being capable generating any mismatched (non-joining) rows, but it does not contain that logic today. Accessing the same table multiple times in a query is not guaranteed to produce the same results in general anyway, depending on isolation level and concurrent activity.



          The optimizer worries about these semantics and edge cases so you don't have to.




          Bonus: Inner Apply Plan



          SQL Server can produce an inner apply plan (not an inner join plan!) for the example query, it just chooses not to for cost reasons. The cost of the outer join plan shown in the question is 0.02898 units on my laptop's SQL Server 2017 instance.



          You can force an apply (correlated join) plan using undocumented and unsupported trace flag 9114 (which disables ApplyHandler etc.) just for illustration:



          SELECT *
          FROM #MyTable AS mt
          CROSS APPLY
          (
          SELECT COUNT_BIG(DISTINCT mt2.Col_B) AS dc
          FROM #MyTable AS mt2
          WHERE mt2.Col_A = mt.Col_A
          --GROUP BY mt2.Col_A
          ) AS ca
          OPTION (QUERYTRACEON 9114);


          This produces an apply nested loops plan with a lazy index spool. The total estimated cost is 0.0463983 (higher than the selected plan):



          Index Spool apply plan



          Note that the execution plan using apply nested loops produces correct results using "inner join" semantics regardless of the presence of the GROUP BY clause.



          In the real world, we would typically have an index to support a seek on the inner side of the apply to encourage SQL Server to choose this option naturally, for example:



          CREATE INDEX i ON #MyTable (Col_A, Col_B);


          db<>fiddle demo






          share|improve this answer

























          • "When Nested Loops is chosen, it can be distinguished from an apply by the lack of Outer References (and usually the presence of a join predicate)." Are you sure? If I write a simple inner join query then then the nested loop operator shows an outer reference. "The inner input of a join never references values from the outer input" Surely by definition the inner loop branch must always reference the outer input, unless the loop join is a cross join?

            – knuckles
            Jun 6 at 14:23






          • 1





            @knuckles Yes I am sure. I wrote Apply versus Nested Loops Join to address your questions and added a link to it to my answer. Cheers.

            – Paul White
            Jun 8 at 13:26


















          -3














          Cross Apply is a logical operation on the data. When deciding how to get that data SQL Server chooses the appropriate physical operator to get the data you want.



          There is no physical apply operator and SQL Server translates it into the appropriate and hopefully efficient join operator.



          You can find a list of the physical operators in the link below.



          https://docs.microsoft.com/en-us/sql/relational-databases/showplan-logical-and-physical-operators-reference?view=sql-server-2017




          The query optimizer creates a query plan as a tree consisting of logical operators. After the query optimizer creates the plan, the query optimizer chooses the most efficient physical operator for each logical operator. The query optimizer uses a cost-based approach to determine which physical operator will implement a logical operator.



          Usually, a logical operation can be implemented by multiple physical
          operators. However, in rare cases, a physical operator can implement
          multiple logical operations as well.




          edit/ It seems I understood your question wrong. SQL server will normally choose the most appropriate operator. Your query doesn't need to return values for all combinations of both tables which is when a cross join would be used. Just calculating the value you want for each row suffices which is what is done here.






          share|improve this answer

























            Your Answer








            StackExchange.ready(function()
            var channelOptions =
            tags: "".split(" "),
            id: "182"
            ;
            initTagRenderer("".split(" "), "".split(" "), channelOptions);

            StackExchange.using("externalEditor", function()
            // Have to fire editor after snippets, if snippets enabled
            if (StackExchange.settings.snippets.snippetsEnabled)
            StackExchange.using("snippets", function()
            createEditor();
            );

            else
            createEditor();

            );

            function createEditor()
            StackExchange.prepareEditor(
            heartbeatType: 'answer',
            autoActivateHeartbeat: false,
            convertImagesToLinks: false,
            noModals: true,
            showLowRepImageUploadWarning: true,
            reputationToPostImages: null,
            bindNavPrevention: true,
            postfix: "",
            imageUploader:
            brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
            contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
            allowUrls: true
            ,
            onDemand: true,
            discardSelector: ".discard-answer"
            ,immediatelyShowMarkdownHelp:true
            );



            );













            draft saved

            draft discarded


















            StackExchange.ready(
            function ()
            StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdba.stackexchange.com%2fquestions%2f239865%2fcross-apply-produces-outer-join%23new-answer', 'question_page');

            );

            Post as a guest















            Required, but never shown

























            2 Answers
            2






            active

            oldest

            votes








            2 Answers
            2






            active

            oldest

            votes









            active

            oldest

            votes






            active

            oldest

            votes









            19














            Summary



            SQL Server uses the correct join (inner or outer) and adds projections where necessary to honour all the semantics of the original query when performing internal translations between apply and join.



            The differences in the plans can all be explained by the different semantics of aggregates with and without a group by clause in SQL Server.




            Details



            Join vs Apply



            We will need to be able to distinguish between an apply and a join:




            • Apply



              The inner (lower) input of the apply is run for each row of the outer (upper) input, with one or more inner side parameter values provided by the current outer row. The overall result of the apply is the combination (union all) of all the rows produced by the parameterized inner side executions. The presence of parameters means apply is sometimes referred to as a correlated join.



              An apply is always implemented in execution plans by the Nested Loops operator. The operator will have an Outer References property rather than join predicates. The outer references are the parameters passed from the outer side to the inner side on each iteration of the loop.




            • Join



              A join evaluates its join predicate at the join operator. The join may generally be implemented by Hash Match, Merge, or Nested Loops operators in SQL Server.



              When Nested Loops is chosen, it can be distinguished from an apply by the lack of Outer References (and usually the presence of a join predicate). The inner input of a join never references values from the outer input - the inner side is still executed once for each outer row, but inner side executions do not depend on any values from the current outer row.



            For more details see my post Apply versus Nested Loops Join.




            ...why is there an outer join in the execution plan instead of an inner join?




            The outer join arises when the optimizer transforms an apply to a join (using a rule called ApplyHandler) to see if it can find a cheaper join-based plan. The join is required to be an outer join for correctness when the apply contains a scalar aggregate. An inner join would not be guaranteed to produce the same results as the original apply as we will see.



            Scalar and Vector Aggregates



            • An aggregate without a corresponding GROUP BY clause is a scalar aggregate.

            • An aggregate with a corresponding GROUP BY clause is a vector aggregate.

            In SQL Server, a scalar aggregate will always produce a row, even if it is given no rows to aggregate. For example, the scalar COUNT aggregate of no rows is zero. A vector COUNT aggregate of no rows is the empty set (no rows at all).



            The following toy queries illustrate the difference. You can also read more about scalar and vector aggregates in my article Fun with Scalar and Vector Aggregates.



            -- Produces a single zero value
            SELECT COUNT_BIG(*) FROM #MyTable AS MT WHERE 0 = 1;

            -- Produces no rows
            SELECT COUNT_BIG(*) FROM #MyTable AS MT WHERE 0 = 1 GROUP BY ();


            db<>fiddle demo



            Transforming apply to join



            I mentioned before that the join is required to be an outer join for correctness when the original apply contains a scalar aggregate. To show why this is the case in detail, I will use a simplified example of the question query:



            DECLARE @A table (A integer NULL, B integer NULL);
            DECLARE @B table (A integer NULL, B integer NULL);

            INSERT @A (A, B) VALUES (1, 1);
            INSERT @B (A, B) VALUES (2, 2);

            SELECT * FROM @A AS A
            CROSS APPLY (SELECT c = COUNT_BIG(*) FROM @B AS B WHERE B.A = A.A) AS CA;


            The correct result for column c is zero, because the COUNT_BIG is a scalar aggregate. When translating this apply query to join form, SQL Server generates an internal alternative that would look similar to the following if it were expressed in T-SQL:



            SELECT A.*, c = COALESCE(J1.c, 0)
            FROM @A AS A
            LEFT JOIN
            (
            SELECT B.A, c = COUNT_BIG(*)
            FROM @B AS B
            GROUP BY B.A
            ) AS J1
            ON J1.A = A.A;


            To rewrite the apply as an uncorrelated join, we have to introduce a GROUP BY in the derived table (otherwise there could be no A column to join on). The join has to be an outer join so each row from table @A continues to produce a row in the output. The left join will produce a NULL for column c when the join predicate does not evaluate to true. That NULL needs to be translated to zero by COALESCE to complete a correct transformation from apply.



            The demo below shows how both outer join and COALESCE are required to produce the same results using join as the original apply query:



            db<>fiddle demo



            With the GROUP BY




            ...why does uncommenting the group by clause result in an inner join?




            Continuing the simplified example, but adding a GROUP BY:



            DECLARE @A table (A integer NULL, B integer NULL);
            DECLARE @B table (A integer NULL, B integer NULL);

            INSERT @A (A, B) VALUES (1, 1);
            INSERT @B (A, B) VALUES (2, 2);

            -- Original
            SELECT * FROM @A AS A
            CROSS APPLY
            (SELECT c = COUNT_BIG(*) FROM @B AS B WHERE B.A = A.A GROUP BY B.A) AS CA;



            The COUNT_BIG is now a vector aggregate, so the correct result for an empty input set is no longer zero, it is no row at all. In other words, running the statements above produces no output.



            These semantics are much easier to honour when translating from apply to join, since CROSS APPLY naturally rejects any outer row that generates no inner side rows. We can therefore safely use an inner join now, with no extra expression projection:



            -- Rewrite
            SELECT A.*, J1.c
            FROM @A AS A
            JOIN
            (
            SELECT B.A, c = COUNT_BIG(*)
            FROM @B AS B
            GROUP BY B.A
            ) AS J1
            ON J1.A = A.A;


            The demo below shows that the inner join rewrite produces the same results as the original apply with vector aggregate:



            db<>fiddle demo



            The optimizer happens to choose a merge inner join with the small table because it finds a cheap join plan quickly (good enough plan found). The cost based optimizer may go on to rewrite the join back to an apply - perhaps finding a cheaper apply plan, as it will here if a loop join or forceseek hint is used - but it is not worth the effort in this case.



            Notes



            The simplified examples use different tables with different contents to show the semantic differences more clearly.



            One could argue that the optimizer ought to be able to reason about a self-join not being capable generating any mismatched (non-joining) rows, but it does not contain that logic today. Accessing the same table multiple times in a query is not guaranteed to produce the same results in general anyway, depending on isolation level and concurrent activity.



            The optimizer worries about these semantics and edge cases so you don't have to.




            Bonus: Inner Apply Plan



            SQL Server can produce an inner apply plan (not an inner join plan!) for the example query, it just chooses not to for cost reasons. The cost of the outer join plan shown in the question is 0.02898 units on my laptop's SQL Server 2017 instance.



            You can force an apply (correlated join) plan using undocumented and unsupported trace flag 9114 (which disables ApplyHandler etc.) just for illustration:



            SELECT *
            FROM #MyTable AS mt
            CROSS APPLY
            (
            SELECT COUNT_BIG(DISTINCT mt2.Col_B) AS dc
            FROM #MyTable AS mt2
            WHERE mt2.Col_A = mt.Col_A
            --GROUP BY mt2.Col_A
            ) AS ca
            OPTION (QUERYTRACEON 9114);


            This produces an apply nested loops plan with a lazy index spool. The total estimated cost is 0.0463983 (higher than the selected plan):



            Index Spool apply plan



            Note that the execution plan using apply nested loops produces correct results using "inner join" semantics regardless of the presence of the GROUP BY clause.



            In the real world, we would typically have an index to support a seek on the inner side of the apply to encourage SQL Server to choose this option naturally, for example:



            CREATE INDEX i ON #MyTable (Col_A, Col_B);


            db<>fiddle demo






            share|improve this answer

























            • "When Nested Loops is chosen, it can be distinguished from an apply by the lack of Outer References (and usually the presence of a join predicate)." Are you sure? If I write a simple inner join query then then the nested loop operator shows an outer reference. "The inner input of a join never references values from the outer input" Surely by definition the inner loop branch must always reference the outer input, unless the loop join is a cross join?

              – knuckles
              Jun 6 at 14:23






            • 1





              @knuckles Yes I am sure. I wrote Apply versus Nested Loops Join to address your questions and added a link to it to my answer. Cheers.

              – Paul White
              Jun 8 at 13:26















            19














            Summary



            SQL Server uses the correct join (inner or outer) and adds projections where necessary to honour all the semantics of the original query when performing internal translations between apply and join.



            The differences in the plans can all be explained by the different semantics of aggregates with and without a group by clause in SQL Server.




            Details



            Join vs Apply



            We will need to be able to distinguish between an apply and a join:




            • Apply



              The inner (lower) input of the apply is run for each row of the outer (upper) input, with one or more inner side parameter values provided by the current outer row. The overall result of the apply is the combination (union all) of all the rows produced by the parameterized inner side executions. The presence of parameters means apply is sometimes referred to as a correlated join.



              An apply is always implemented in execution plans by the Nested Loops operator. The operator will have an Outer References property rather than join predicates. The outer references are the parameters passed from the outer side to the inner side on each iteration of the loop.




            • Join



              A join evaluates its join predicate at the join operator. The join may generally be implemented by Hash Match, Merge, or Nested Loops operators in SQL Server.



              When Nested Loops is chosen, it can be distinguished from an apply by the lack of Outer References (and usually the presence of a join predicate). The inner input of a join never references values from the outer input - the inner side is still executed once for each outer row, but inner side executions do not depend on any values from the current outer row.



            For more details see my post Apply versus Nested Loops Join.




            ...why is there an outer join in the execution plan instead of an inner join?




            The outer join arises when the optimizer transforms an apply to a join (using a rule called ApplyHandler) to see if it can find a cheaper join-based plan. The join is required to be an outer join for correctness when the apply contains a scalar aggregate. An inner join would not be guaranteed to produce the same results as the original apply as we will see.



            Scalar and Vector Aggregates



            • An aggregate without a corresponding GROUP BY clause is a scalar aggregate.

            • An aggregate with a corresponding GROUP BY clause is a vector aggregate.

            In SQL Server, a scalar aggregate will always produce a row, even if it is given no rows to aggregate. For example, the scalar COUNT aggregate of no rows is zero. A vector COUNT aggregate of no rows is the empty set (no rows at all).



            The following toy queries illustrate the difference. You can also read more about scalar and vector aggregates in my article Fun with Scalar and Vector Aggregates.



            -- Produces a single zero value
            SELECT COUNT_BIG(*) FROM #MyTable AS MT WHERE 0 = 1;

            -- Produces no rows
            SELECT COUNT_BIG(*) FROM #MyTable AS MT WHERE 0 = 1 GROUP BY ();


            db<>fiddle demo



            Transforming apply to join



            I mentioned before that the join is required to be an outer join for correctness when the original apply contains a scalar aggregate. To show why this is the case in detail, I will use a simplified example of the question query:



            DECLARE @A table (A integer NULL, B integer NULL);
            DECLARE @B table (A integer NULL, B integer NULL);

            INSERT @A (A, B) VALUES (1, 1);
            INSERT @B (A, B) VALUES (2, 2);

            SELECT * FROM @A AS A
            CROSS APPLY (SELECT c = COUNT_BIG(*) FROM @B AS B WHERE B.A = A.A) AS CA;


            The correct result for column c is zero, because the COUNT_BIG is a scalar aggregate. When translating this apply query to join form, SQL Server generates an internal alternative that would look similar to the following if it were expressed in T-SQL:



            SELECT A.*, c = COALESCE(J1.c, 0)
            FROM @A AS A
            LEFT JOIN
            (
            SELECT B.A, c = COUNT_BIG(*)
            FROM @B AS B
            GROUP BY B.A
            ) AS J1
            ON J1.A = A.A;


            To rewrite the apply as an uncorrelated join, we have to introduce a GROUP BY in the derived table (otherwise there could be no A column to join on). The join has to be an outer join so each row from table @A continues to produce a row in the output. The left join will produce a NULL for column c when the join predicate does not evaluate to true. That NULL needs to be translated to zero by COALESCE to complete a correct transformation from apply.



            The demo below shows how both outer join and COALESCE are required to produce the same results using join as the original apply query:



            db<>fiddle demo



            With the GROUP BY




            ...why does uncommenting the group by clause result in an inner join?




            Continuing the simplified example, but adding a GROUP BY:



            DECLARE @A table (A integer NULL, B integer NULL);
            DECLARE @B table (A integer NULL, B integer NULL);

            INSERT @A (A, B) VALUES (1, 1);
            INSERT @B (A, B) VALUES (2, 2);

            -- Original
            SELECT * FROM @A AS A
            CROSS APPLY
            (SELECT c = COUNT_BIG(*) FROM @B AS B WHERE B.A = A.A GROUP BY B.A) AS CA;



            The COUNT_BIG is now a vector aggregate, so the correct result for an empty input set is no longer zero, it is no row at all. In other words, running the statements above produces no output.



            These semantics are much easier to honour when translating from apply to join, since CROSS APPLY naturally rejects any outer row that generates no inner side rows. We can therefore safely use an inner join now, with no extra expression projection:



            -- Rewrite
            SELECT A.*, J1.c
            FROM @A AS A
            JOIN
            (
            SELECT B.A, c = COUNT_BIG(*)
            FROM @B AS B
            GROUP BY B.A
            ) AS J1
            ON J1.A = A.A;


            The demo below shows that the inner join rewrite produces the same results as the original apply with vector aggregate:



            db<>fiddle demo



            The optimizer happens to choose a merge inner join with the small table because it finds a cheap join plan quickly (good enough plan found). The cost based optimizer may go on to rewrite the join back to an apply - perhaps finding a cheaper apply plan, as it will here if a loop join or forceseek hint is used - but it is not worth the effort in this case.



            Notes



            The simplified examples use different tables with different contents to show the semantic differences more clearly.



            One could argue that the optimizer ought to be able to reason about a self-join not being capable generating any mismatched (non-joining) rows, but it does not contain that logic today. Accessing the same table multiple times in a query is not guaranteed to produce the same results in general anyway, depending on isolation level and concurrent activity.



            The optimizer worries about these semantics and edge cases so you don't have to.




            Bonus: Inner Apply Plan



            SQL Server can produce an inner apply plan (not an inner join plan!) for the example query, it just chooses not to for cost reasons. The cost of the outer join plan shown in the question is 0.02898 units on my laptop's SQL Server 2017 instance.



            You can force an apply (correlated join) plan using undocumented and unsupported trace flag 9114 (which disables ApplyHandler etc.) just for illustration:



            SELECT *
            FROM #MyTable AS mt
            CROSS APPLY
            (
            SELECT COUNT_BIG(DISTINCT mt2.Col_B) AS dc
            FROM #MyTable AS mt2
            WHERE mt2.Col_A = mt.Col_A
            --GROUP BY mt2.Col_A
            ) AS ca
            OPTION (QUERYTRACEON 9114);


            This produces an apply nested loops plan with a lazy index spool. The total estimated cost is 0.0463983 (higher than the selected plan):



            Index Spool apply plan



            Note that the execution plan using apply nested loops produces correct results using "inner join" semantics regardless of the presence of the GROUP BY clause.



            In the real world, we would typically have an index to support a seek on the inner side of the apply to encourage SQL Server to choose this option naturally, for example:



            CREATE INDEX i ON #MyTable (Col_A, Col_B);


            db<>fiddle demo






            share|improve this answer

























            • "When Nested Loops is chosen, it can be distinguished from an apply by the lack of Outer References (and usually the presence of a join predicate)." Are you sure? If I write a simple inner join query then then the nested loop operator shows an outer reference. "The inner input of a join never references values from the outer input" Surely by definition the inner loop branch must always reference the outer input, unless the loop join is a cross join?

              – knuckles
              Jun 6 at 14:23






            • 1





              @knuckles Yes I am sure. I wrote Apply versus Nested Loops Join to address your questions and added a link to it to my answer. Cheers.

              – Paul White
              Jun 8 at 13:26













            19












            19








            19







            Summary



            SQL Server uses the correct join (inner or outer) and adds projections where necessary to honour all the semantics of the original query when performing internal translations between apply and join.



            The differences in the plans can all be explained by the different semantics of aggregates with and without a group by clause in SQL Server.




            Details



            Join vs Apply



            We will need to be able to distinguish between an apply and a join:




            • Apply



              The inner (lower) input of the apply is run for each row of the outer (upper) input, with one or more inner side parameter values provided by the current outer row. The overall result of the apply is the combination (union all) of all the rows produced by the parameterized inner side executions. The presence of parameters means apply is sometimes referred to as a correlated join.



              An apply is always implemented in execution plans by the Nested Loops operator. The operator will have an Outer References property rather than join predicates. The outer references are the parameters passed from the outer side to the inner side on each iteration of the loop.




            • Join



              A join evaluates its join predicate at the join operator. The join may generally be implemented by Hash Match, Merge, or Nested Loops operators in SQL Server.



              When Nested Loops is chosen, it can be distinguished from an apply by the lack of Outer References (and usually the presence of a join predicate). The inner input of a join never references values from the outer input - the inner side is still executed once for each outer row, but inner side executions do not depend on any values from the current outer row.



            For more details see my post Apply versus Nested Loops Join.




            ...why is there an outer join in the execution plan instead of an inner join?




            The outer join arises when the optimizer transforms an apply to a join (using a rule called ApplyHandler) to see if it can find a cheaper join-based plan. The join is required to be an outer join for correctness when the apply contains a scalar aggregate. An inner join would not be guaranteed to produce the same results as the original apply as we will see.



            Scalar and Vector Aggregates



            • An aggregate without a corresponding GROUP BY clause is a scalar aggregate.

            • An aggregate with a corresponding GROUP BY clause is a vector aggregate.

            In SQL Server, a scalar aggregate will always produce a row, even if it is given no rows to aggregate. For example, the scalar COUNT aggregate of no rows is zero. A vector COUNT aggregate of no rows is the empty set (no rows at all).



            The following toy queries illustrate the difference. You can also read more about scalar and vector aggregates in my article Fun with Scalar and Vector Aggregates.



            -- Produces a single zero value
            SELECT COUNT_BIG(*) FROM #MyTable AS MT WHERE 0 = 1;

            -- Produces no rows
            SELECT COUNT_BIG(*) FROM #MyTable AS MT WHERE 0 = 1 GROUP BY ();


            db<>fiddle demo



            Transforming apply to join



            I mentioned before that the join is required to be an outer join for correctness when the original apply contains a scalar aggregate. To show why this is the case in detail, I will use a simplified example of the question query:



            DECLARE @A table (A integer NULL, B integer NULL);
            DECLARE @B table (A integer NULL, B integer NULL);

            INSERT @A (A, B) VALUES (1, 1);
            INSERT @B (A, B) VALUES (2, 2);

            SELECT * FROM @A AS A
            CROSS APPLY (SELECT c = COUNT_BIG(*) FROM @B AS B WHERE B.A = A.A) AS CA;


            The correct result for column c is zero, because the COUNT_BIG is a scalar aggregate. When translating this apply query to join form, SQL Server generates an internal alternative that would look similar to the following if it were expressed in T-SQL:



            SELECT A.*, c = COALESCE(J1.c, 0)
            FROM @A AS A
            LEFT JOIN
            (
            SELECT B.A, c = COUNT_BIG(*)
            FROM @B AS B
            GROUP BY B.A
            ) AS J1
            ON J1.A = A.A;


            To rewrite the apply as an uncorrelated join, we have to introduce a GROUP BY in the derived table (otherwise there could be no A column to join on). The join has to be an outer join so each row from table @A continues to produce a row in the output. The left join will produce a NULL for column c when the join predicate does not evaluate to true. That NULL needs to be translated to zero by COALESCE to complete a correct transformation from apply.



            The demo below shows how both outer join and COALESCE are required to produce the same results using join as the original apply query:



            db<>fiddle demo



            With the GROUP BY




            ...why does uncommenting the group by clause result in an inner join?




            Continuing the simplified example, but adding a GROUP BY:



            DECLARE @A table (A integer NULL, B integer NULL);
            DECLARE @B table (A integer NULL, B integer NULL);

            INSERT @A (A, B) VALUES (1, 1);
            INSERT @B (A, B) VALUES (2, 2);

            -- Original
            SELECT * FROM @A AS A
            CROSS APPLY
            (SELECT c = COUNT_BIG(*) FROM @B AS B WHERE B.A = A.A GROUP BY B.A) AS CA;



            The COUNT_BIG is now a vector aggregate, so the correct result for an empty input set is no longer zero, it is no row at all. In other words, running the statements above produces no output.



            These semantics are much easier to honour when translating from apply to join, since CROSS APPLY naturally rejects any outer row that generates no inner side rows. We can therefore safely use an inner join now, with no extra expression projection:



            -- Rewrite
            SELECT A.*, J1.c
            FROM @A AS A
            JOIN
            (
            SELECT B.A, c = COUNT_BIG(*)
            FROM @B AS B
            GROUP BY B.A
            ) AS J1
            ON J1.A = A.A;


            The demo below shows that the inner join rewrite produces the same results as the original apply with vector aggregate:



            db<>fiddle demo



            The optimizer happens to choose a merge inner join with the small table because it finds a cheap join plan quickly (good enough plan found). The cost based optimizer may go on to rewrite the join back to an apply - perhaps finding a cheaper apply plan, as it will here if a loop join or forceseek hint is used - but it is not worth the effort in this case.



            Notes



            The simplified examples use different tables with different contents to show the semantic differences more clearly.



            One could argue that the optimizer ought to be able to reason about a self-join not being capable generating any mismatched (non-joining) rows, but it does not contain that logic today. Accessing the same table multiple times in a query is not guaranteed to produce the same results in general anyway, depending on isolation level and concurrent activity.



            The optimizer worries about these semantics and edge cases so you don't have to.




            Bonus: Inner Apply Plan



            SQL Server can produce an inner apply plan (not an inner join plan!) for the example query, it just chooses not to for cost reasons. The cost of the outer join plan shown in the question is 0.02898 units on my laptop's SQL Server 2017 instance.



            You can force an apply (correlated join) plan using undocumented and unsupported trace flag 9114 (which disables ApplyHandler etc.) just for illustration:



            SELECT *
            FROM #MyTable AS mt
            CROSS APPLY
            (
            SELECT COUNT_BIG(DISTINCT mt2.Col_B) AS dc
            FROM #MyTable AS mt2
            WHERE mt2.Col_A = mt.Col_A
            --GROUP BY mt2.Col_A
            ) AS ca
            OPTION (QUERYTRACEON 9114);


            This produces an apply nested loops plan with a lazy index spool. The total estimated cost is 0.0463983 (higher than the selected plan):



            Index Spool apply plan



            Note that the execution plan using apply nested loops produces correct results using "inner join" semantics regardless of the presence of the GROUP BY clause.



            In the real world, we would typically have an index to support a seek on the inner side of the apply to encourage SQL Server to choose this option naturally, for example:



            CREATE INDEX i ON #MyTable (Col_A, Col_B);


            db<>fiddle demo






            share|improve this answer















            Summary



            SQL Server uses the correct join (inner or outer) and adds projections where necessary to honour all the semantics of the original query when performing internal translations between apply and join.



            The differences in the plans can all be explained by the different semantics of aggregates with and without a group by clause in SQL Server.




            Details



            Join vs Apply



            We will need to be able to distinguish between an apply and a join:




            • Apply



              The inner (lower) input of the apply is run for each row of the outer (upper) input, with one or more inner side parameter values provided by the current outer row. The overall result of the apply is the combination (union all) of all the rows produced by the parameterized inner side executions. The presence of parameters means apply is sometimes referred to as a correlated join.



              An apply is always implemented in execution plans by the Nested Loops operator. The operator will have an Outer References property rather than join predicates. The outer references are the parameters passed from the outer side to the inner side on each iteration of the loop.




            • Join



              A join evaluates its join predicate at the join operator. The join may generally be implemented by Hash Match, Merge, or Nested Loops operators in SQL Server.



              When Nested Loops is chosen, it can be distinguished from an apply by the lack of Outer References (and usually the presence of a join predicate). The inner input of a join never references values from the outer input - the inner side is still executed once for each outer row, but inner side executions do not depend on any values from the current outer row.



            For more details see my post Apply versus Nested Loops Join.




            ...why is there an outer join in the execution plan instead of an inner join?




            The outer join arises when the optimizer transforms an apply to a join (using a rule called ApplyHandler) to see if it can find a cheaper join-based plan. The join is required to be an outer join for correctness when the apply contains a scalar aggregate. An inner join would not be guaranteed to produce the same results as the original apply as we will see.



            Scalar and Vector Aggregates



            • An aggregate without a corresponding GROUP BY clause is a scalar aggregate.

            • An aggregate with a corresponding GROUP BY clause is a vector aggregate.

            In SQL Server, a scalar aggregate will always produce a row, even if it is given no rows to aggregate. For example, the scalar COUNT aggregate of no rows is zero. A vector COUNT aggregate of no rows is the empty set (no rows at all).



            The following toy queries illustrate the difference. You can also read more about scalar and vector aggregates in my article Fun with Scalar and Vector Aggregates.



            -- Produces a single zero value
            SELECT COUNT_BIG(*) FROM #MyTable AS MT WHERE 0 = 1;

            -- Produces no rows
            SELECT COUNT_BIG(*) FROM #MyTable AS MT WHERE 0 = 1 GROUP BY ();


            db<>fiddle demo



            Transforming apply to join



            I mentioned before that the join is required to be an outer join for correctness when the original apply contains a scalar aggregate. To show why this is the case in detail, I will use a simplified example of the question query:



            DECLARE @A table (A integer NULL, B integer NULL);
            DECLARE @B table (A integer NULL, B integer NULL);

            INSERT @A (A, B) VALUES (1, 1);
            INSERT @B (A, B) VALUES (2, 2);

            SELECT * FROM @A AS A
            CROSS APPLY (SELECT c = COUNT_BIG(*) FROM @B AS B WHERE B.A = A.A) AS CA;


            The correct result for column c is zero, because the COUNT_BIG is a scalar aggregate. When translating this apply query to join form, SQL Server generates an internal alternative that would look similar to the following if it were expressed in T-SQL:



            SELECT A.*, c = COALESCE(J1.c, 0)
            FROM @A AS A
            LEFT JOIN
            (
            SELECT B.A, c = COUNT_BIG(*)
            FROM @B AS B
            GROUP BY B.A
            ) AS J1
            ON J1.A = A.A;


            To rewrite the apply as an uncorrelated join, we have to introduce a GROUP BY in the derived table (otherwise there could be no A column to join on). The join has to be an outer join so each row from table @A continues to produce a row in the output. The left join will produce a NULL for column c when the join predicate does not evaluate to true. That NULL needs to be translated to zero by COALESCE to complete a correct transformation from apply.



            The demo below shows how both outer join and COALESCE are required to produce the same results using join as the original apply query:



            db<>fiddle demo



            With the GROUP BY




            ...why does uncommenting the group by clause result in an inner join?




            Continuing the simplified example, but adding a GROUP BY:



            DECLARE @A table (A integer NULL, B integer NULL);
            DECLARE @B table (A integer NULL, B integer NULL);

            INSERT @A (A, B) VALUES (1, 1);
            INSERT @B (A, B) VALUES (2, 2);

            -- Original
            SELECT * FROM @A AS A
            CROSS APPLY
            (SELECT c = COUNT_BIG(*) FROM @B AS B WHERE B.A = A.A GROUP BY B.A) AS CA;



            The COUNT_BIG is now a vector aggregate, so the correct result for an empty input set is no longer zero, it is no row at all. In other words, running the statements above produces no output.



            These semantics are much easier to honour when translating from apply to join, since CROSS APPLY naturally rejects any outer row that generates no inner side rows. We can therefore safely use an inner join now, with no extra expression projection:



            -- Rewrite
            SELECT A.*, J1.c
            FROM @A AS A
            JOIN
            (
            SELECT B.A, c = COUNT_BIG(*)
            FROM @B AS B
            GROUP BY B.A
            ) AS J1
            ON J1.A = A.A;


            The demo below shows that the inner join rewrite produces the same results as the original apply with vector aggregate:



            db<>fiddle demo



            The optimizer happens to choose a merge inner join with the small table because it finds a cheap join plan quickly (good enough plan found). The cost based optimizer may go on to rewrite the join back to an apply - perhaps finding a cheaper apply plan, as it will here if a loop join or forceseek hint is used - but it is not worth the effort in this case.



            Notes



            The simplified examples use different tables with different contents to show the semantic differences more clearly.



            One could argue that the optimizer ought to be able to reason about a self-join not being capable generating any mismatched (non-joining) rows, but it does not contain that logic today. Accessing the same table multiple times in a query is not guaranteed to produce the same results in general anyway, depending on isolation level and concurrent activity.



            The optimizer worries about these semantics and edge cases so you don't have to.




            Bonus: Inner Apply Plan



            SQL Server can produce an inner apply plan (not an inner join plan!) for the example query, it just chooses not to for cost reasons. The cost of the outer join plan shown in the question is 0.02898 units on my laptop's SQL Server 2017 instance.



            You can force an apply (correlated join) plan using undocumented and unsupported trace flag 9114 (which disables ApplyHandler etc.) just for illustration:



            SELECT *
            FROM #MyTable AS mt
            CROSS APPLY
            (
            SELECT COUNT_BIG(DISTINCT mt2.Col_B) AS dc
            FROM #MyTable AS mt2
            WHERE mt2.Col_A = mt.Col_A
            --GROUP BY mt2.Col_A
            ) AS ca
            OPTION (QUERYTRACEON 9114);


            This produces an apply nested loops plan with a lazy index spool. The total estimated cost is 0.0463983 (higher than the selected plan):



            Index Spool apply plan



            Note that the execution plan using apply nested loops produces correct results using "inner join" semantics regardless of the presence of the GROUP BY clause.



            In the real world, we would typically have an index to support a seek on the inner side of the apply to encourage SQL Server to choose this option naturally, for example:



            CREATE INDEX i ON #MyTable (Col_A, Col_B);


            db<>fiddle demo







            share|improve this answer














            share|improve this answer



            share|improve this answer








            edited Jun 8 at 13:24

























            answered Jun 5 at 20:46









            Paul WhitePaul White

            56.2k14295469




            56.2k14295469












            • "When Nested Loops is chosen, it can be distinguished from an apply by the lack of Outer References (and usually the presence of a join predicate)." Are you sure? If I write a simple inner join query then then the nested loop operator shows an outer reference. "The inner input of a join never references values from the outer input" Surely by definition the inner loop branch must always reference the outer input, unless the loop join is a cross join?

              – knuckles
              Jun 6 at 14:23






            • 1





              @knuckles Yes I am sure. I wrote Apply versus Nested Loops Join to address your questions and added a link to it to my answer. Cheers.

              – Paul White
              Jun 8 at 13:26

















            • "When Nested Loops is chosen, it can be distinguished from an apply by the lack of Outer References (and usually the presence of a join predicate)." Are you sure? If I write a simple inner join query then then the nested loop operator shows an outer reference. "The inner input of a join never references values from the outer input" Surely by definition the inner loop branch must always reference the outer input, unless the loop join is a cross join?

              – knuckles
              Jun 6 at 14:23






            • 1





              @knuckles Yes I am sure. I wrote Apply versus Nested Loops Join to address your questions and added a link to it to my answer. Cheers.

              – Paul White
              Jun 8 at 13:26
















            "When Nested Loops is chosen, it can be distinguished from an apply by the lack of Outer References (and usually the presence of a join predicate)." Are you sure? If I write a simple inner join query then then the nested loop operator shows an outer reference. "The inner input of a join never references values from the outer input" Surely by definition the inner loop branch must always reference the outer input, unless the loop join is a cross join?

            – knuckles
            Jun 6 at 14:23





            "When Nested Loops is chosen, it can be distinguished from an apply by the lack of Outer References (and usually the presence of a join predicate)." Are you sure? If I write a simple inner join query then then the nested loop operator shows an outer reference. "The inner input of a join never references values from the outer input" Surely by definition the inner loop branch must always reference the outer input, unless the loop join is a cross join?

            – knuckles
            Jun 6 at 14:23




            1




            1





            @knuckles Yes I am sure. I wrote Apply versus Nested Loops Join to address your questions and added a link to it to my answer. Cheers.

            – Paul White
            Jun 8 at 13:26





            @knuckles Yes I am sure. I wrote Apply versus Nested Loops Join to address your questions and added a link to it to my answer. Cheers.

            – Paul White
            Jun 8 at 13:26













            -3














            Cross Apply is a logical operation on the data. When deciding how to get that data SQL Server chooses the appropriate physical operator to get the data you want.



            There is no physical apply operator and SQL Server translates it into the appropriate and hopefully efficient join operator.



            You can find a list of the physical operators in the link below.



            https://docs.microsoft.com/en-us/sql/relational-databases/showplan-logical-and-physical-operators-reference?view=sql-server-2017




            The query optimizer creates a query plan as a tree consisting of logical operators. After the query optimizer creates the plan, the query optimizer chooses the most efficient physical operator for each logical operator. The query optimizer uses a cost-based approach to determine which physical operator will implement a logical operator.



            Usually, a logical operation can be implemented by multiple physical
            operators. However, in rare cases, a physical operator can implement
            multiple logical operations as well.




            edit/ It seems I understood your question wrong. SQL server will normally choose the most appropriate operator. Your query doesn't need to return values for all combinations of both tables which is when a cross join would be used. Just calculating the value you want for each row suffices which is what is done here.






            share|improve this answer



























              -3














              Cross Apply is a logical operation on the data. When deciding how to get that data SQL Server chooses the appropriate physical operator to get the data you want.



              There is no physical apply operator and SQL Server translates it into the appropriate and hopefully efficient join operator.



              You can find a list of the physical operators in the link below.



              https://docs.microsoft.com/en-us/sql/relational-databases/showplan-logical-and-physical-operators-reference?view=sql-server-2017




              The query optimizer creates a query plan as a tree consisting of logical operators. After the query optimizer creates the plan, the query optimizer chooses the most efficient physical operator for each logical operator. The query optimizer uses a cost-based approach to determine which physical operator will implement a logical operator.



              Usually, a logical operation can be implemented by multiple physical
              operators. However, in rare cases, a physical operator can implement
              multiple logical operations as well.




              edit/ It seems I understood your question wrong. SQL server will normally choose the most appropriate operator. Your query doesn't need to return values for all combinations of both tables which is when a cross join would be used. Just calculating the value you want for each row suffices which is what is done here.






              share|improve this answer

























                -3












                -3








                -3







                Cross Apply is a logical operation on the data. When deciding how to get that data SQL Server chooses the appropriate physical operator to get the data you want.



                There is no physical apply operator and SQL Server translates it into the appropriate and hopefully efficient join operator.



                You can find a list of the physical operators in the link below.



                https://docs.microsoft.com/en-us/sql/relational-databases/showplan-logical-and-physical-operators-reference?view=sql-server-2017




                The query optimizer creates a query plan as a tree consisting of logical operators. After the query optimizer creates the plan, the query optimizer chooses the most efficient physical operator for each logical operator. The query optimizer uses a cost-based approach to determine which physical operator will implement a logical operator.



                Usually, a logical operation can be implemented by multiple physical
                operators. However, in rare cases, a physical operator can implement
                multiple logical operations as well.




                edit/ It seems I understood your question wrong. SQL server will normally choose the most appropriate operator. Your query doesn't need to return values for all combinations of both tables which is when a cross join would be used. Just calculating the value you want for each row suffices which is what is done here.






                share|improve this answer













                Cross Apply is a logical operation on the data. When deciding how to get that data SQL Server chooses the appropriate physical operator to get the data you want.



                There is no physical apply operator and SQL Server translates it into the appropriate and hopefully efficient join operator.



                You can find a list of the physical operators in the link below.



                https://docs.microsoft.com/en-us/sql/relational-databases/showplan-logical-and-physical-operators-reference?view=sql-server-2017




                The query optimizer creates a query plan as a tree consisting of logical operators. After the query optimizer creates the plan, the query optimizer chooses the most efficient physical operator for each logical operator. The query optimizer uses a cost-based approach to determine which physical operator will implement a logical operator.



                Usually, a logical operation can be implemented by multiple physical
                operators. However, in rare cases, a physical operator can implement
                multiple logical operations as well.




                edit/ It seems I understood your question wrong. SQL server will normally choose the most appropriate operator. Your query doesn't need to return values for all combinations of both tables which is when a cross join would be used. Just calculating the value you want for each row suffices which is what is done here.







                share|improve this answer












                share|improve this answer



                share|improve this answer










                answered Jun 5 at 14:13









                J. MaesJ. Maes

                1384




                1384



























                    draft saved

                    draft discarded
















































                    Thanks for contributing an answer to Database Administrators Stack Exchange!


                    • Please be sure to answer the question. Provide details and share your research!

                    But avoid


                    • Asking for help, clarification, or responding to other answers.

                    • Making statements based on opinion; back them up with references or personal experience.

                    To learn more, see our tips on writing great answers.




                    draft saved


                    draft discarded














                    StackExchange.ready(
                    function ()
                    StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdba.stackexchange.com%2fquestions%2f239865%2fcross-apply-produces-outer-join%23new-answer', 'question_page');

                    );

                    Post as a guest















                    Required, but never shown





















































                    Required, but never shown














                    Required, but never shown












                    Required, but never shown







                    Required, but never shown

































                    Required, but never shown














                    Required, but never shown












                    Required, but never shown







                    Required, but never shown







                    Popular posts from this blog

                    How to write a 12-bar blues melodyI-IV-V blues progressionHow to play the bridges in a standard blues progressionHow does Gdim7 fit in C# minor?question on a certain chord progressionMusicology of Melody12 bar blues, spread rhythm: alternative to 6th chord to avoid finger stretchChord progressions/ Root key/ MelodiesHow to put chords (POP-EDM) under a given lead vocal melody (starting from a good knowledge in music theory)Are there “rules” for improvising with the minor pentatonic scale over 12-bar shuffle?Confusion about blues scale and chords

                    Esgonzo ibérico Índice Descrición Distribución Hábitat Ameazas Notas Véxase tamén "Acerca dos nomes dos anfibios e réptiles galegos""Chalcides bedriagai"Chalcides bedriagai en Carrascal, L. M. Salvador, A. (Eds). Enciclopedia virtual de los vertebrados españoles. Museo Nacional de Ciencias Naturales, Madrid. España.Fotos

                    Cegueira Índice Epidemioloxía | Deficiencia visual | Tipos de cegueira | Principais causas de cegueira | Tratamento | Técnicas de adaptación e axudas | Vida dos cegos | Primeiros auxilios | Crenzas respecto das persoas cegas | Crenzas das persoas cegas | O neno deficiente visual | Aspectos psicolóxicos da cegueira | Notas | Véxase tamén | Menú de navegación54.054.154.436928256blindnessDicionario da Real Academia GalegaPortal das Palabras"International Standards: Visual Standards — Aspects and Ranges of Vision Loss with Emphasis on Population Surveys.""Visual impairment and blindness""Presentan un plan para previr a cegueira"o orixinalACCDV Associació Catalana de Cecs i Disminuïts Visuals - PMFTrachoma"Effect of gene therapy on visual function in Leber's congenital amaurosis"1844137110.1056/NEJMoa0802268Cans guía - os mellores amigos dos cegosArquivadoEscola de cans guía para cegos en Mortágua, PortugalArquivado"Tecnología para ciegos y deficientes visuales. Recopilación de recursos gratuitos en la Red""Colorino""‘COL.diesis’, escuchar los sonidos del color""COL.diesis: Transforming Colour into Melody and Implementing the Result in a Colour Sensor Device"o orixinal"Sistema de desarrollo de sinestesia color-sonido para invidentes utilizando un protocolo de audio""Enseñanza táctil - geometría y color. Juegos didácticos para niños ciegos y videntes""Sistema Constanz"L'ocupació laboral dels cecs a l'Estat espanyol està pràcticament equiparada a la de les persones amb visió, entrevista amb Pedro ZuritaONCE (Organización Nacional de Cegos de España)Prevención da cegueiraDescrición de deficiencias visuais (Disc@pnet)Braillín, un boneco atractivo para calquera neno, con ou sen discapacidade, que permite familiarizarse co sistema de escritura e lectura brailleAxudas Técnicas36838ID00897494007150-90057129528256DOID:1432HP:0000618D001766C10.597.751.941.162C97109C0155020