[Complexity] Simplify set32_fx()

Basic info

This is a subtask of #1009 (closed): it splits off the simplification/complexity optimization of the function set32_fx(). This optimization is bit-exact (BE).

Bug description

The function is overly complicated: it seems to distinguish between a 32-bit variable in which only the 16 LSBs are populated and any other 32-bit variable. This is probably a misinterpretation of section 18.10.1 of the STL manual, which states:

**18.10.1 Data moves**

Each data move between two 16-bit or two 32-bit variables, move16() and move32() operators respectively, has a complexity weight of 1.

1. A 16-bit variable cannot be directly moved to a 32-bit or 40-bit variable.

(...)

For above 3 types of moves, functions such as the following ones must be used:

round_fx() extract_h() extract_l() L_deposit_h() L_deposit_l()

round40() Extract40_H() Extract40_L() L40_deposit_h() L40_deposit_l()

L_saturate40() L_Extract40() L40_deposit32()

There will be no extra weighting for data move when using above functions: the weighting of the data move is already included in the weighting of these functions.

In our understanding, a 32-bit variable is a 32-bit variable, regardless of the value it actually holds. In addition, the 16-bit branch of the function counts the data move twice (L_deposit_l() + move32()).

void set32_fx(
    Word32 y[],     /* i/o: Vector to set                       */
    const Word32 a, /* i  : Value to set the vector to          */
    const Word16 N  /* i  : Length of the vector                */
)
{
#ifdef PATCH
    Word16 i;

    FOR( i = 0; i < N; i++ )
    {
        y[i] = a;
        move32();
    }
#else
    Word16 i, tmp;
    tmp = extract_l( a );
    IF( EQ_32( L_deposit_l( tmp ), a ) )
    {
        FOR( i = 0; i < N; i++ )
        {
            y[i] = L_deposit_l( tmp );
            move32();
        }
    }
    ELSE
    {
        FOR( i = 0; i < N; i++ )
        {
            y[i] = a;
            move32();
        }
    }
#endif

    return;
}

A simplification as outlined above is proposed.
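
To illustrate why the change should be bit-exact while lowering the counted complexity, here is a minimal standalone sketch. It does not use the real STL instrumentation: the `*_sketch` operators, the `weight` counter, `MAX_LEN` and the helper `paths_agree()` are hypothetical stand-ins invented for this example, and the weights of the comparison and branch are deliberately ignored.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef int16_t Word16;
typedef int32_t Word32;

#define MAX_LEN 64 /* hypothetical upper bound for the test vectors */

/* Hypothetical weight counter, standing in for the STL complexity counters. */
static long weight = 0;

/* Simplified stand-ins for the basic operators; each adds a weight of 1. */
static Word16 extract_l_sketch( Word32 x )
{
    uint16_t lo = (uint16_t) ( (uint32_t) x & 0xFFFFu );
    weight++;
    /* Reinterpret the low 16 bits as a signed value without relying on
       implementation-defined narrowing. */
    return (Word16) ( lo >= 0x8000u ? (int32_t) lo - 0x10000 : (int32_t) lo );
}

static Word32 L_deposit_l_sketch( Word16 x )
{
    weight++;
    return (Word32) x; /* sign-extends the 16-bit value */
}

static void move32_sketch( void )
{
    weight++;
}

/* Current behaviour: a separate path when a fits into 16 bits. */
static void set32_current( Word32 y[], Word32 a, Word16 N )
{
    Word16 i, tmp;
    tmp = extract_l_sketch( a );
    if ( L_deposit_l_sketch( tmp ) == a )
    {
        for ( i = 0; i < N; i++ )
        {
            y[i] = L_deposit_l_sketch( tmp ); /* data move counted here ... */
            move32_sketch();                  /* ... and counted again here */
        }
    }
    else
    {
        for ( i = 0; i < N; i++ )
        {
            y[i] = a;
            move32_sketch();
        }
    }
}

/* Proposed behaviour: one plain 32-bit move per element. */
static void set32_proposed( Word32 y[], Word32 a, Word16 N )
{
    Word16 i;
    for ( i = 0; i < N; i++ )
    {
        y[i] = a;
        move32_sketch();
    }
}

/* Returns 1 if both variants write identical vectors and reports the
   weight accumulated by each variant. */
static int paths_agree( Word32 a, Word16 N, long *w_current, long *w_proposed )
{
    Word32 y0[MAX_LEN], y1[MAX_LEN];
    assert( N >= 0 && N <= MAX_LEN );
    weight = 0;
    set32_current( y0, a, N );
    *w_current = weight;
    weight = 0;
    set32_proposed( y1, a, N );
    *w_proposed = weight;
    return memcmp( y0, y1, (size_t) N * sizeof( Word32 ) ) == 0;
}
```

Under this simplified counting, `paths_agree( 5, 4, &wc, &wp )` reports identical output vectors with a weight of 10 for the current code and 4 for the proposed code. For a value that does not fit into 16 bits, such as 100000, the weights are 6 and 4. The actual figures reported by the STL counters will differ, since the real instrumentation also weights the comparison and the branch.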

Ways to reproduce

See the command line in #1009 (closed).