Note on encoder scaling
Basic info
- Encoder (fixed): 7847d456
Bug description
I started tracing the encoder scaling flow. Here are some observations:
- Ivas_enc_fx()
○ Scales all input channels from W16.Q0 -> W32.Q11
○ Hp20()
§ Some scaling is done internally, only for the function; the output is still in W32.Q11 (tbc)
○ ivas_cpe_enc_fx()
i. Copy data_fx >> sts[0]->input32_fx (W32.Q11)
ii. Copy sts[0]->input32_fx >> sts[0]->input (W16.Q0) (now input is hp20 filtered)
iii. select_stereo_mode()
iv. stereo_mode_combined_format_enc_fx()
v. front_vad_fx() -> not used for stereo
vi. stereo_memory_enc_fx()
vii. Copy sts[x]->input_fx >> sts[x]->input32_fx (W16.Q0 -> W32.Q11)
1) -> This doesn't seem needed given step ii, **and it removes all the precision that was kept in input32_fx**
viii. Find maximum scaling factor for sts[1]->input32_fx
1) Scale sts[1]->input32_fx
ix. Find maximum scaling factor for sts[0]->input32_fx
1) Scale sts[0]->input32_fx
x. Find minimum scaling factor between Q_inp and scaled sts[0]->input32_fx and sts[1]->input32_fx
1) Here **Q_inp always seems to be 0, so the minimum will be 0 as well**
2) Rescale sts[0]->input32_fx and sts[1]->input32_fx again with the new factor
a) Basically going from W32.Q11+x -> W32.Q0
i) If the goal is to go back to Q0 here, none of this scaling is needed
xi. stereo_set_tdm_fx()
xii. Find scaling factor for sts[x]->old_input_signal_fx
1) Scale sts[x]->old_input_signal_fx
xiii. Find scaling factor for sts[x]->input_fx
1) Scale sts[x]->input_fx
xiv. Find the minimum scaling factor for sts[x]->input_fx and sts[x]->old_input_signal_fx
1) Rescale sts[x]->input_fx and sts[x]->old_input_signal_fx
2) Scale sts[0]->buf_speech_enc by the same factor. **Does it cover the whole buffer?**
xv. stereo_switching_enc_fx()
xvi. Copy and scale sts[x]->input_buff_fx >> sts[x]->input_buff32_fx (W16.Q_inp -> W32.Q_inp+6)
xvii. stereo_tca_enc_fx()
xviii. Copy and scale sts[x]->input_buff32_fx >> sts[x]->input_buff_fx W32.Q_inp+6 -> W16.Q0
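To make the concern in steps ii and vii concrete, here is a minimal standalone sketch of the W32.Q11 -> W16.Q0 -> W32.Q11 round trip. It does not use the codec's basic operators; the helper names (w32q11_to_w16q0, w16q0_to_w32q11, roundtrip_error_q11) and the rounding mode are assumptions made for illustration only, and the real code may round or truncate differently.

```c
#include <assert.h>
#include <stdint.h>

#define Q_IN32 11 /* fractional bits of input32_fx, per the notes above */

/* W32.Q11 -> W16.Q0 with rounding and saturation (hypothetical helper). */
static int16_t w32q11_to_w16q0(int32_t x_q11)
{
    const int64_t one = (int64_t)1 << Q_IN32; /* 2048 */
    const int64_t half = one / 2;
    int64_t x = x_q11;
    /* Round half away from zero; division avoids shifting negative values. */
    int64_t r = (x >= 0) ? (x + half) / one : -((-x + half) / one);
    if (r > INT16_MAX) r = INT16_MAX;
    if (r < INT16_MIN) r = INT16_MIN;
    return (int16_t)r;
}

/* W16.Q0 -> W32.Q11: exact, but the 11 fractional bits come back as zeros. */
static int32_t w16q0_to_w32q11(int16_t x_q0)
{
    assert(Q_IN32 < 16); /* 16 value bits plus Q_IN32 must fit in 32 bits */
    return (int32_t)x_q0 * ((int32_t)1 << Q_IN32);
}

/* Error introduced by step ii followed by step vii, in Q11 units. */
static int32_t roundtrip_error_q11(int32_t x_q11)
{
    return x_q11 - w16q0_to_w32q11(w32q11_to_w16q0(x_q11));
}
```

For example, the Q11 value 205500 (about 100.34) becomes 100 in W16.Q0 and 204800 after the copy back, so roundtrip_error_q11(205500) is 700. Under these assumptions, whatever fractional precision hp20 left in input32_fx is gone after step vii.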
So far I have only traced the beginning of the flow. The logic is hard to follow, and some precision seems to be lost along the way. It would probably be good to revise it.
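The observation in steps viii to x can be sketched the same way. The code below is only an illustration under stated assumptions: headroom32 and common_q are hypothetical stand-ins for the norm-based helpers used in the encoder, and the sample values are made up. It shows that once Q_inp is 0, the common Q is 0 regardless of how much headroom was found per channel.

```c
#include <assert.h>
#include <stdint.h>

/* Largest left shift that keeps every sample inside int32 (hypothetical helper). */
static int headroom32(const int32_t *buf, int n)
{
    int64_t max_abs = 0;
    int shift = 0;
    assert(n > 0);
    for (int i = 0; i < n; i++) {
        int64_t a = buf[i] < 0 ? -(int64_t)buf[i] : (int64_t)buf[i];
        if (a > max_abs) max_abs = a;
    }
    /* Grow the shift while the largest magnitude still fits after shifting. */
    while (shift < 31 && (max_abs << (shift + 1)) <= INT32_MAX) {
        shift++;
    }
    return shift;
}

/* Step x: the common Q is the minimum of Q_inp and the two per-channel Qs. */
static int common_q(int q_inp, int q_ch0, int q_ch1)
{
    int q = q_ch0 < q_ch1 ? q_ch0 : q_ch1;
    return q_inp < q ? q_inp : q;
}
```

With a channel whose largest magnitude is 205500 in Q11, headroom32 returns 13, so the channel would sit at Q24 after step ix. A second channel peaking at 1000000 gets headroom 11, i.e. Q22. common_q(0, 24, 22) is still 0, so both buffers are shifted back down to Q0, whereas taking the minimum over the two channels alone would give Q22.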