In NLP, transformer models perform well across many tasks thanks to their self-attention mechanism and parallel computing capability. Which of the following statements about transformer models are true?
In cases where the bright and dark areas of an image are too extreme, which of the following techniques can be used to improve the image?
How many parameters need to be learned when a 3 × 3 convolution kernel is used to perform the convolution operation on two three-channel color images?
Mel-frequency cepstral coefficients (MFCCs) take into account human auditory characteristics by first mapping the linear spectrum to the Mel nonlinear spectrum based on auditory perception, and then converting it to the cepstral domain.
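As a rough illustration of the two steps named in the statement above, the sketch below first maps frequencies in Hz onto the Mel scale and then applies a DCT-II to log filter-bank energies to reach the cepstral domain. The 2595/700 constants are one commonly used Mel formula (other variants exist), and the function names are hypothetical, chosen only for illustration. It is not a full MFCC pipeline: framing, windowing, the FFT, and the triangular filter bank are omitted.

```python
import math


def hz_to_mel(freq_hz):
    """Map a linear frequency in Hz to the Mel scale (one common formula)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def cepstral_coefficients(filterbank_energies, num_coeffs):
    """Unnormalized DCT-II of the log filter-bank energies.

    filterbank_energies is assumed to hold positive Mel filter-bank
    outputs for a single frame.
    """
    log_energies = [math.log(e) for e in filterbank_energies]
    n = len(log_energies)
    return [
        sum(
            log_energies[j] * math.cos(math.pi * k * (j + 0.5) / n)
            for j in range(n)
        )
        for k in range(num_coeffs)
    ]
```

Equal steps on the Mel axis correspond to increasingly wide steps in Hz, which is how the mapping reflects the ear's coarser resolution at high frequencies.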
Maximum likelihood estimation (MLE) can be used for parameter estimation in a Gaussian mixture model (GMM).
In the field of deep learning, which of the following activation functions has a derivative not greater than 0.5?
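One way to reason about a question like this is to evaluate each candidate's derivative numerically and compare the maxima. The answer options are not listed here, so the three functions below (sigmoid, tanh, ReLU) are assumed candidates, and `max_on_grid` is a hypothetical helper; treat this as a minimal sketch rather than a proof.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_grad(x):
    # d/dx sigmoid(x) = sigmoid(x) * (1 - sigmoid(x))
    s = sigmoid(x)
    return s * (1.0 - s)


def tanh_grad(x):
    # d/dx tanh(x) = 1 - tanh(x)^2
    return 1.0 - math.tanh(x) ** 2


def relu_grad(x):
    # Derivative of ReLU away from 0; the value at exactly 0 is a convention.
    return 1.0 if x > 0 else 0.0


def max_on_grid(fn, lo=-10.0, hi=10.0, steps=2001):
    """Largest value of fn on an evenly spaced grid over [lo, hi]."""
    step = (hi - lo) / (steps - 1)
    return max(fn(lo + i * step) for i in range(steps))


for name, grad in [("sigmoid", sigmoid_grad), ("tanh", tanh_grad), ("relu", relu_grad)]:
    print(name, round(max_on_grid(grad), 4))
```

Analytically, the sigmoid derivative s(1 - s) peaks at s = 0.5, giving 0.25, while the derivatives of tanh and ReLU both reach 1.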
When the chi-square test is used for feature selection, SelectKBest and the _____ function or class must be imported from the sklearn.feature_selection module. (Enter the function interface name.)
Overfitting is a condition where a model is overly simple and excessive generalization errors occur.
Which of the following statements about the functions of the encoder and decoder is true?
The technologies underlying ModelArts support a wide range of heterogeneous compute resources, allowing you to flexibly use the resources that fit your needs.
If a scanned document is not properly placed, and the text is tilted, it is difficult to recognize the characters in the document. Which of the following techniques can be used for correction in this case?