Potential MASA metadata framing asynchrony
In MASA encoding, when a metadata frame contains multiple consecutive sub-frames with similar values, the IVAS encoder can prefer high frequency resolution over high temporal resolution at a comparable bitrate (e.g., at 96 kbps, 18 bands + 1 sub-frame vs. 5 bands + 4 sub-frames). In some situations, the encoding framing of IVAS is not aligned with the MASA metadata framing, so that an encoding frame may contain dissimilar metadata sub-frames even when the underlying data contains similar sub-frames. As a consequence, the encoder prefers high temporal resolution at the cost of frequency resolution, which reduces perceptual output quality.
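
The effect can be illustrated with a minimal Python sketch. It is not the IVAS encoder logic: the similarity rule, the tolerance `tol`, the scalar metadata values, and the helper names `choose_resolution` and `encode_frames` are all assumptions made for illustration. Only the two configurations (18 bands + 1 sub-frame and 5 bands + 4 sub-frames) are taken from the text above.

```python
SUBFRAMES_PER_FRAME = 4  # matches the "4 sub-frames" configuration in the text


def choose_resolution(subframes, tol=0.05):
    """Return (bands, sub-frames) for one encoding frame.

    Hypothetical rule, not the IVAS decision logic: if every sub-frame value
    lies within `tol` of the first one, pick high frequency resolution;
    otherwise pick high temporal resolution.
    """
    similar = all(abs(v - subframes[0]) <= tol for v in subframes)
    return (18, 1) if similar else (5, 4)


def encode_frames(stream, offset):
    """Cut a sub-frame stream into encoding frames starting at `offset`
    and return the resolution chosen for each complete frame."""
    last_start = len(stream) - SUBFRAMES_PER_FRAME
    return [
        choose_resolution(stream[start:start + SUBFRAMES_PER_FRAME])
        for start in range(offset, last_start + 1, SUBFRAMES_PER_FRAME)
    ]


# Three metadata frames, each holding four identical sub-frame values
# (a single scalar stands in for the real per-band metadata).
stream = [0.1] * 4 + [0.5] * 4 + [0.9] * 4

aligned = encode_frames(stream, offset=0)     # encoder framing matches metadata framing
misaligned = encode_frames(stream, offset=2)  # encoder framing shifted by two sub-frames

print(aligned)
print(misaligned)
```

With `offset=0` every encoding frame sees four similar sub-frames, so the sketch selects 18 bands + 1 sub-frame throughout. With `offset=2` each encoding frame straddles two metadata frames, and the sketch falls back to 5 bands + 4 sub-frames even though the input is unchanged.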