Complexity of algorithms: Difference between revisions

From Citizendium
Jump to navigation Jump to search
imported>Nick Johnson
(fixed broken math markup)
mNo edit summary
 
(28 intermediate revisions by 13 users not shown)
Line 1: Line 1:
In [[computer science]], '''Big O Notation''' is a [[notation]] for expressing the worst-case [[execution time]] for an [[algorithm]]. 
{{subpages}}


It is generally written as <math>O(f(x_1, x_2, \ldots, x_n))</math>, where <math>f</math> is a [[function]] of zero or more parameters <math>x_i</math> of the systemIt roughly states that in the [[asymptote|aymptotic case]] (i.e. as one or more of the system parameters extend towards their extremes), the actual execution time of the algorithm is bounded by <math>K\times f(x_1,x_2,\ldots,x_n)</math> for some positive constant K.
In [[computer science]],  the '''complexity of an algorithm''' is a way to classify how efficient an algorithm is, compared to alternative onesThe focus is on how [[execution time]] increases with the data set to be processed.


Big O Notation is attractive to computer scientists because it expresses the complexity of an algorithm without restricting the statement to a particular [[computer architecture]] or time period; even if computers get faster, or if the algorithm is switched from a PC to a Apple, the worst case execution time of an algorithm does not change.  Additionally, when comparing two algorithms, it is readily apparent which will execute faster in the extreme case.


==Informal definition==
The complexity of a given algorithm is expressed by the worst-case [[execution time]], using the [[big O notation]]. It is generally written as <math>O(f(x_1, x_2, \ldots, x_n))</math>, where <math>f</math> is a [[function]] of one or more parameters <math>x_i</math> of the system.  The "O" is read "order" as in "order of magnitude" or "on the order of". It roughly states that in the [[asymptote|asymptotic case]] (i.e. as one or more of the system parameters extend towards their extremes), the actual execution time of the algorithm is bounded by <math>K\times f(x_1,x_2,\ldots,x_n)</math> for some positive constant K.
Sometimes one is interested in expected or best case execution time as well,  or even the algorithms requirements of memory or other resources.
The big O notation expresses the complexity of an algorithm without restricting the statement to a particular [[computer architecture]] or time period; even if computers get faster, or if the algorithm is ported to another software platform, the worst case execution time of an algorithm does not change.  Additionally, when comparing two algorithms, it is readily apparent which will execute faster in the extreme case.
==Classes of complexity==
Most commonly, one will see
Most commonly, one will see


Line 10: Line 17:
** <math>O(1)</math> --- Constant time --- the time necessary to perform the algorithm does not change in response to the size of the problem.  
** <math>O(1)</math> --- Constant time --- the time necessary to perform the algorithm does not change in response to the size of the problem.  
** <math>O(n)</math> --- Linear time --- the time grows linearly with the size (n) of the problem.
** <math>O(n)</math> --- Linear time --- the time grows linearly with the size (n) of the problem.
** <math>O(n^2)</math> --- Quadratic time --- the time grows quadratically with the size (n) of the problem.  Please note that, by convention, one only writes the highest-degree term for polynomial time.
** <math>O(n^2)</math> --- Quadratic time --- the time grows quadratically with the size (n) of the problem.  In [[big O notation]], all polynomials with the same degree are equivalent, so <math>O(3n^2 + 3n + 7)=O(n^2)</math>
* Sub-polynomial time algorithms (grow slower than polynomial time algorithms).
* Sub-linear time algorithms (grow slower than linear time algorithms).
** <math>O(log n)</math> -- Logarithmic time
** <math>O(\log n)</math> -- Logarithmic time
* Super-polynomial time algorithms (grow faster than polynomial time algorithms).
* Super-polynomial time algorithms (grow faster than polynomial time algorithms).
** <math>O(n!)</math>
** <math>O(n!)</math>
** <math>O(2^n)</math>
** <math>O(2^n)</math> --- Exponential time --- the time required grows exponentially with the size of the problem.
 
Despite the examples, with this notation one can express execution time as a function of multiple variables.
 
== Expected case, best case ==
 
Similar analysis can be performed to find the expected and best case execution times of algorithms.  The best case is generally written as <math>\Omega( f(x_1, x_2,\ldots, x_n))</math>.
 
== Example of finding the complexity of an algorithm ==
 
 
=== Bubble sort ===
 
Take, for example the [[bubble sort]] algorithm for sorting a list of numbers,
 
<code>
  Procedure: BubbleSort(List[])
  Inputs: List[] - A list of numbers
  Locals: i, j - integers
  Begin:
    For i = 0 to List.Size-1
        For j = i+1 to List.Size-1
          If List[i] > List[j], Then
              Swap List[i] and List[j]
          End If
        Next j
    Next i
</code>
 
The size of the sorting problem is related to the size ''N'' of the list.  As ''N'' increases, we expect the execution time to increase as well.  To find the complexity of this algorithm, we start from the middle and work our way out.
 
The innermost instruction is to swap two list elements.  Regardless of how long the list is, this always takes the same amount of time, and so the swap instruction is O(1) with respect to ''N''.  Surrounding that instruction is an [[if statement]], which either executes its body, or does not; in the worst case, it will execute its body.  Again, the if statement is O(1).
 
The if statement is executed as many times as the inner loop iterates, and the inner loop executes as many times as the outer loop interates.  During the first iteration (''i'' = 0), the inner loop executes ''N'' − 1 times, during the second iteration (''i'' = 1), ''N'' − 2 times, and so on.  So, our complexity for this algorithm is:
 
:<math>
O \bigg( \underbrace{ \sum_{i=0}^{N-1} (N-i-1) }_{\textrm{number-of-loops}} \times (\underbrace{\underbrace{1}_{if} + \underbrace{1}_{swap}}_{\textrm{time-per-loop}}) \bigg)
= O( N^2 - N  ).
</math>
 
The <math>N</math> term is irrelevant in the asymptotic case since it is dominated by the <math>N^2</math> term. So we write the complexity of the bubble sort algorithm as <math>O(N^2)</math>.  The expected and best cases are identical, so bubble sort is <math>\Omega(N^2)</math> and <math>\Theta(N^2)</math>.
 
=== Quick sort ===
 
It is important to understand that complexity as we have discussed here pertains to the algorithm, not to the problem.  As was noted earlier, the bubble sort algorithm has complexity of <math>O(N^2)</math>.  However, sorting a list of numbers can be done more efficiently.  Take, for instance, the [[quick sort]] algorithm:
 
<code>
  Procedure: QuickSort(List[])
  Inputs: List[] - A list of numbers,
  Locals: Left[] - A list of numbers, initially empty
          Right[] - A list of numbers, initially empty
          PivotElement, Elt - numbers
  Begin:
      If List.Size > 1, Then
        Let PivotElement = List[ List.Size / 2 ]
        For each element Elt in List[], do
            If Elt <= PivotElement, Then
              Append Elt to Left[]
            Else
              Append Elt to Right[]
            End If
        Next
        QuickSort(Left[])
        QuickSort(Right[])
        Let List[] = Left[] concatenate Right[]
      End If
</code>
 
This algorithm is more difficult to analyze because it involves recursion, but is nonetheless possible.  Again, we work from the inside out.  Appending an element to a list is O(1), and so is the innermost if statement.  Additionally, list concatenation is O(1)  The pivoting loop executes once for each element in the list, and thus the loop is O(N). 




Despite the examples, Big O Notation can express execution time as a function of multiple examples.
If we assume that, pivoting loop splits List[] into two halves of equal size (i.e. both Left[] and Right[] have N/2 elements) then, the expected complexity of the quick sort algorithm is <math>\Theta(f(n))</math>, where:


<math>
f(n) = \underbrace{N}_{loop} + \underbrace{2\times f({{N}\over{2}})}_{recursion}
= N \log N
</math>


However, if pivoting does not always split the array in half, then the worst case becomes <math>O(N^2)</math>.


== See Also ==
The quick sort algorithm can thus be shown to be more efficient in the expected case.  Once the problem size grows large enough, bubble sort must take more time.  More formally, for any positive constants <math>K_1, K_2</math>, there exists a list length <math>N_0</math> such that:


* [[complexity theory]]
<math>
* [[complexity class]]
\forall N > N_0: K_1 \Theta(N^2) > K_2 \Theta(N \log N)
</math>[[Category:Suggestion Bot Tag]]

Latest revision as of 11:01, 31 July 2024

This article is developing and not approved.
Main Article
Discussion
Related Articles  [?]
Bibliography  [?]
External Links  [?]
Citable Version  [?]
 
This editable Main Article is under development and subject to a disclaimer.

In computer science, the complexity of an algorithm is a way to classify how efficient an algorithm is, compared to alternative ones. The focus is on how execution time increases with the data set to be processed.


Informal definition

The complexity of a given algorithm is expressed by the worst-case execution time, using the big O notation. It is generally written as , where is a function of one or more parameters of the system. The "O" is read "order" as in "order of magnitude" or "on the order of". It roughly states that in the asymptotic case (i.e. as one or more of the system parameters extend towards their extremes), the actual execution time of the algorithm is bounded by for some positive constant K.

Sometimes one is interested in expected or best case execution time as well, or even the algorithms requirements of memory or other resources.

The big O notation expresses the complexity of an algorithm without restricting the statement to a particular computer architecture or time period; even if computers get faster, or if the algorithm is ported to another software platform, the worst case execution time of an algorithm does not change. Additionally, when comparing two algorithms, it is readily apparent which will execute faster in the extreme case.

Classes of complexity

Most commonly, one will see

  • Polynomial time algorithms,
    • --- Constant time --- the time necessary to perform the algorithm does not change in response to the size of the problem.
    • --- Linear time --- the time grows linearly with the size (n) of the problem.
    • --- Quadratic time --- the time grows quadratically with the size (n) of the problem. In big O notation, all polynomials with the same degree are equivalent, so
  • Sub-linear time algorithms (grow slower than linear time algorithms).
    • -- Logarithmic time
  • Super-polynomial time algorithms (grow faster than polynomial time algorithms).
    • --- Exponential time --- the time required grows exponentially with the size of the problem.

Despite the examples, with this notation one can express execution time as a function of multiple variables.

Expected case, best case

Similar analysis can be performed to find the expected and best case execution times of algorithms. The best case is generally written as .

Example of finding the complexity of an algorithm

Bubble sort

Take, for example the bubble sort algorithm for sorting a list of numbers,

 Procedure: BubbleSort(List[])
 Inputs: List[] - A list of numbers
 Locals: i, j - integers
 Begin:
    For i = 0 to List.Size-1
       For j = i+1 to List.Size-1
          If List[i] > List[j], Then
             Swap List[i] and List[j]
          End If
       Next j
    Next i

The size of the sorting problem is related to the size N of the list. As N increases, we expect the execution time to increase as well. To find the complexity of this algorithm, we start from the middle and work our way out.

The innermost instruction is to swap two list elements. Regardless of how long the list is, this always takes the same amount of time, and so the swap instruction is O(1) with respect to N. Surrounding that instruction is an if statement, which either executes its body, or does not; in the worst case, it will execute its body. Again, the if statement is O(1).

The if statement is executed as many times as the inner loop iterates, and the inner loop executes as many times as the outer loop interates. During the first iteration (i = 0), the inner loop executes N − 1 times, during the second iteration (i = 1), N − 2 times, and so on. So, our complexity for this algorithm is:

The term is irrelevant in the asymptotic case since it is dominated by the term. So we write the complexity of the bubble sort algorithm as . The expected and best cases are identical, so bubble sort is and .

Quick sort

It is important to understand that complexity as we have discussed here pertains to the algorithm, not to the problem. As was noted earlier, the bubble sort algorithm has complexity of . However, sorting a list of numbers can be done more efficiently. Take, for instance, the quick sort algorithm:

  Procedure: QuickSort(List[])
  Inputs: List[] - A list of numbers,
  Locals: Left[] - A list of numbers, initially empty
          Right[] - A list of numbers, initially empty
          PivotElement, Elt - numbers
  Begin:
     If List.Size > 1, Then
        Let PivotElement = List[ List.Size / 2 ]
        For each element Elt in List[], do
           If Elt <= PivotElement, Then
              Append Elt to Left[]
           Else
              Append Elt to Right[]
           End If
        Next
        QuickSort(Left[])
        QuickSort(Right[])
        Let List[] = Left[] concatenate Right[]
     End If

This algorithm is more difficult to analyze because it involves recursion, but is nonetheless possible. Again, we work from the inside out. Appending an element to a list is O(1), and so is the innermost if statement. Additionally, list concatenation is O(1) The pivoting loop executes once for each element in the list, and thus the loop is O(N).


If we assume that, pivoting loop splits List[] into two halves of equal size (i.e. both Left[] and Right[] have N/2 elements) then, the expected complexity of the quick sort algorithm is , where:

However, if pivoting does not always split the array in half, then the worst case becomes .

The quick sort algorithm can thus be shown to be more efficient in the expected case. Once the problem size grows large enough, bubble sort must take more time. More formally, for any positive constants , there exists a list length such that: