Speed improvement in power_output.py #106

While running large datasets, I noticed that the `np.interp` call in the `power_curve_density_correction` function slows down significantly when the input is a pandas object (a `pd.Series` in the code below) rather than a NumPy array.
A workaround is to convert the pandas input to a NumPy array before the interpolation and convert the result back afterwards. In my runs this was up to 5-6x faster when the input is longer than 8000 timesteps.
This helps speed up runs on large datasets.
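
One way to check the effect on your own machine is a small timing comparison. The sketch below is not windpowerlib code: `elementwise_sum` and the array size are hypothetical stand-ins for the per-timestep indexing (`wind_speed[i]`) done inside the loop, and the numbers it prints will differ between pandas versions and hardware.

```python
import timeit

import numpy as np
import pandas as pd


def elementwise_sum(values):
    """Index `values` one element at a time, like the loop in the function."""
    total = 0.0
    for i in range(len(values)):
        total += values[i]
    return total


n = 8000  # roughly the input length mentioned above
series = pd.Series(np.linspace(0.0, 25.0, n))  # default integer index
array = series.values  # same data as a plain NumPy array

# Time the same element-wise access pattern on both containers.
t_series = timeit.timeit(lambda: elementwise_sum(series), number=5)
t_array = timeit.timeit(lambda: elementwise_sum(array), number=5)
print(f"Series: {t_series:.3f} s, ndarray: {t_array:.3f} s")
```

Both calls return the same sum; only the time spent on indexing differs.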


```
def power_curve_density_correction(
    wind_speed, power_curve_wind_speeds, power_curve_values, density):
    r"""
    Calculates the turbine power output using a density corrected power curve.
    This function is carried out when the parameter `density_correction` of an
    instance of the :class:`~.modelchain.ModelChain` class is True.
    Parameters
    ----------
    wind_speed : :pandas:`pandas.Series<series>` or numpy.array
        Wind speed at hub height in m/s.
    power_curve_wind_speeds : :pandas:`pandas.Series<series>` or numpy.array
        Wind speeds in m/s for which the power curve values are provided in
        `power_curve_values`.
    power_curve_values : :pandas:`pandas.Series<series>` or numpy.array
        Power curve values corresponding to wind speeds in
        `power_curve_wind_speeds`.
    density : :pandas:`pandas.Series<series>` or numpy.array
        Density of air at hub height in kg/m³.
    Returns
    -------
    :pandas:`pandas.Series<series>` or numpy.array
        Electrical power output of the wind turbine in W.
        Data type depends on type of `wind_speed`.
    Notes
    -----
    The following equation is used for the site specific power curve wind
    speeds [1]_ [2]_ [3]_:
    .. math:: v_{site}=v_{std}\cdot\left(\frac{\rho_0}
                       {\rho_{site}}\right)^{p(v)}
    with:
        .. math:: p=\begin{cases}
                      \frac{1}{3} & v_{std} \leq 7.5\text{ m/s}\\
                      \frac{1}{15}\cdot v_{std}-\frac{1}{6} & 7.5
                      \text{ m/s}<v_{std}<12.5\text{ m/s}\\
                      \frac{2}{3} & v_{std} \geq 12.5 \text{ m/s}
                    \end{cases},
        v: wind speed [m/s], :math:`\rho`: density [kg/m³]
    :math:`v_{std}` is the standard wind speed in the power curve
    (:math:`v_{std}`, :math:`P_{std}`),
    :math:`v_{site}` is the density corrected wind speed for the power curve
    (:math:`v_{site}`, :math:`P_{std}`),
    :math:`\rho_0` is the ambient density (1.225 kg/m³)
    and :math:`\rho_{site}` the density at site conditions (and hub height).
    It is assumed that the power output for wind speeds above the maximum
    and below the minimum wind speed given in the power curve is zero.
    References
    ----------
    .. [1] Svenningsen, L.: "Power Curve Air Density Correction And Other
            Power Curve Options in WindPRO". 1st edition, Aalborg,
            EMD International A/S , 2010, p. 4
    .. [2] Svenningsen, L.: "Proposal of an Improved Power Curve Correction".
            EMD International A/S , 2010
    .. [3] Biank, M.: "Methodology, Implementation and Validation of a
            Variable Scale Simulation Model for Windpower based on the
            Georeferenced Installation Register of Germany". Master's Thesis
            at Reiner Lemoine Institute, 2014, p. 13
    """
    if density is None:
        raise TypeError(
            "`density` is None. For the calculation with a "
            + "density corrected power curve density at hub "
            + "height is needed."
        )

    # NOTE: changes are made here.
    # Flag that records whether the input was a pandas Series.
    Panda_series = False

    if isinstance(wind_speed, pd.Series):
        # Save the index for the later conversion back to pd.Series.
        indexes = wind_speed.index
        # Convert the wind speed Series to a NumPy array.
        wind_speed = wind_speed.values
        # Set the pandas flag to True.
        Panda_series = True

    power_output = [
        (
            np.interp(
                wind_speed[i],
                power_curve_wind_speeds
                * (1.225 / density[i])
                ** (
                    np.interp(
                        power_curve_wind_speeds, [7.5, 12.5], [1 / 3, 2 / 3]
                    )
                ),
                power_curve_values,
                left=0,
                right=0,
            )
        )
        for i in range(len(wind_speed))
    ]

    # Power_output as pd.Series if wind_speed is pd.Series (else: np.array)
    if Panda_series:  # use the flag to check
        power_output = pd.Series(
            data=power_output,
            index=indexes,  # use the previously saved index
            name="feedin_power_plant",
        )
    else:
        power_output = np.array(power_output)
    return power_output
```
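
A possible further step, offered as a sketch rather than a finished patch: the exponent `np.interp(power_curve_wind_speeds, [7.5, 12.5], [1 / 3, 2 / 3])` does not depend on `i`, so it can be computed once before the loop, and `density` can be converted to a NumPy array alongside `wind_speed`. The function name `density_corrected_power_sketch` is hypothetical and not part of windpowerlib. I am assuming here that `wind_speed` and `density` have the same length and order.

```python
import numpy as np
import pandas as pd


def density_corrected_power_sketch(
    wind_speed, power_curve_wind_speeds, power_curve_values, density
):
    """Hypothetical variant: same formula, inputs handled as NumPy arrays."""
    if density is None:
        raise TypeError("`density` is None but is needed for the correction.")

    # Remember the index so that a Series input yields a Series output.
    index = wind_speed.index if isinstance(wind_speed, pd.Series) else None

    # Convert everything to plain arrays before the per-timestep loop.
    speeds = np.asarray(wind_speed, dtype=float)
    rho = np.asarray(density, dtype=float)
    curve_speeds = np.asarray(power_curve_wind_speeds, dtype=float)
    curve_values = np.asarray(power_curve_values, dtype=float)

    # The exponent p(v) is identical for every timestep, so compute it once.
    exponent = np.interp(curve_speeds, [7.5, 12.5], [1 / 3, 2 / 3])

    power_output = np.array(
        [
            np.interp(
                v,
                curve_speeds * (1.225 / r) ** exponent,
                curve_values,
                left=0,
                right=0,
            )
            for v, r in zip(speeds, rho)
        ]
    )

    if index is not None:
        return pd.Series(power_output, index=index, name="feedin_power_plant")
    return power_output
```

The loop still calls `np.interp` once per timestep because the corrected power curve changes with each density value, so this only removes the repeated exponent calculation and the pandas indexing overhead.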



