Browsing by Author "Sarria Paja, Milton"
Now showing 1 - 10 of 10
Item: Alternative Measurement Instrument to Evaluate Executive Functions in Children of All Scholar Range (Institute of Electrical and Electronics Engineers Inc., 2019-04-26) Martínez Ortega, Sara Virginia; Sarria Paja, Milton
Problems related to executive functions can be characterized as weaknesses in a set of mental skills that are key during the learning process. Child neuropsychologists are often asked to determine whether a child experiencing difficulties in the academic setting has a specific learning disability, an attentional deficit, a memory impairment, or some combination of these problems. Typically, the diagnosis is carried out using a set of standardized questionnaires, which are not user friendly and place considerable stress on the evaluated child. In this work, we propose a measurement instrument that uses computational tools to evaluate the principal dimensions of the executive functions: visuo-spatial working memory, phonological working memory, processing speed, inhibitory control, planning and organization, and cognitive flexibility, in children across the whole school-age range (6 to 16 years old). Conclusive results have not been obtained yet, as this is the initial stage of development.

Item: Artificial Intelligence of Behavior for Human Emotion Recognition in Closed Environments (Institute of Electrical and Electronics Engineers Inc., 2024) Alvarez Garcia, Gonzalo Alberto; Zuniga Canon, Claudia; Garcia Sanchez, Antonio Javier; Garcia Haro, Joan; Sarria Paja, Milton; Asorey Cacheda, Rafael
Understanding human emotions and behavior in closed environments is essential for creating more empathetic and humane spaces. Environmental factors, such as temperature, noise, and light, play a crucial role in influencing behavior, but individuals' emotional states are equally important and often go unnoticed. Artificial Intelligence of Behavior (AIoB) offers a novel approach that integrates environmental measurements with human emotions to create spatially adaptive processes that can influence behavior. In this article, we present a new human emotion sensor developed using video cameras and implemented on a System on Chip (SoC) development board. Our approach uses Convolutional Neural Networks (CNNs) to recognize the presence of emotions in enclosed spaces and generate parameters that can influence emotional states and behavior within an AIoB system. The work integrates advanced CNN technology into an SoC platform, allowing for real-time processing of video data, and the versatility of an energy-efficient SoC extends its application to smart environments aimed at improving mental health. The study employs algorithms capable of detecting emotional states across various individuals and identifies the CNN operations best suited to the technical specifications of the devices involved. The development involves a three-step process: (i) collecting enough data to build a robust model, (ii) training the model and evaluating its performance using test values, and (iii) deploying the model on the development board. Our study demonstrates the feasibility of using AIoB to recognize and respond to human emotions in closed areas. By integrating emotional cues with environmental measurements, our system can create more personalized and empathetic spaces that cater to the needs of individuals. Our approach could have significant implications for designing public spaces to promote well-being and emotional satisfaction.
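The abstract above does not specify the network architecture or the input pipeline, so the following is only a minimal sketch of the kind of CNN-based emotion classifier it describes, assuming (hypothetically) 48x48 grayscale face crops and seven emotion classes; the authors' actual model, training data, and SoC deployment details are not reproduced here.

```python
# Minimal sketch (not the authors' model): a small CNN that maps a 48x48
# grayscale face crop to one of seven hypothetical emotion classes.
import torch
import torch.nn as nn

class EmotionCNN(nn.Module):
    def __init__(self, num_classes: int = 7):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),   # 48 -> 24
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),  # 24 -> 12
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),  # 12 -> 6
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(64 * 6 * 6, 128), nn.ReLU(),
            nn.Linear(128, num_classes),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.features(x))

if __name__ == "__main__":
    model = EmotionCNN().eval()
    frame = torch.rand(1, 1, 48, 48)            # stand-in for one preprocessed video frame
    probs = torch.softmax(model(frame), dim=1)  # per-class emotion probabilities
    print(probs.argmax(dim=1).item())
```

For deployment on an energy-constrained SoC, a model of roughly this size would typically be quantized or exported to an embedded inference runtime, though the abstract does not state which toolchain the authors used.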
Item: Classification of nonverbal human produced audio events: A pilot study (International Speech Communication Association, 2018-09-06) Bouserhal, Rachel E.; Chabot, Philippe; Sarria Paja, Milton; Cardinal, Patrick; Voix, Jérémie
The accurate classification of nonverbal human produced audio events opens the door to numerous applications beyond health monitoring. Voluntary events, such as tongue clicking and teeth chattering, may lead to a novel way of silent interface command. Involuntary events, such as coughing and clearing the throat, may advance the current state of the art in hearing health research. The challenge of such applications is the balance between the processing capabilities of a small intra-aural device and the accuracy of classification. In this pilot study, 10 nonverbal audio events are captured inside the ear canal blocked by an intra-aural device. The performance of three classifiers is investigated: Gaussian Mixture Model (GMM), Support Vector Machine, and Multi-Layer Perceptron. Each classifier is trained using three different feature vector structures constructed from the mel-frequency cepstral coefficients (MFCC) and their derivatives. Fusion of the MFCCs with the auditory-inspired amplitude modulation features (AAMF) is also investigated. Classification is compared between binaural and monaural training sets as well as for noisy and clean conditions. The highest accuracy, 75.45%, is achieved using the GMM classifier with the binaural MFCC+AAMF clean training set. An accuracy of 73.47% is achieved by training and testing the classifier with the binaural clean and noisy dataset.
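As a rough illustration of one of the classifiers mentioned above, here is a minimal sketch of an MFCC-plus-deltas front end with one GMM per event class, where an unseen clip is assigned to the class whose model gives the highest average frame log-likelihood. The event names, mixture sizes, and synthetic waveforms are placeholders, not the paper's data or configuration.

```python
# Minimal sketch (not the paper's exact pipeline): per-clip MFCCs and deltas,
# one Gaussian mixture model per event class, maximum-likelihood classification.
import numpy as np
import librosa
from sklearn.mixture import GaussianMixture

def mfcc_features(y, sr=16000, n_mfcc=13):
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)
    delta = librosa.feature.delta(mfcc)
    return np.vstack([mfcc, delta]).T           # frames x (2 * n_mfcc)

def train_gmms(clips_per_class, n_components=4):
    gmms = {}
    for label, clips in clips_per_class.items():
        frames = np.vstack([mfcc_features(y) for y in clips])
        gmms[label] = GaussianMixture(n_components=n_components,
                                      covariance_type="diag").fit(frames)
    return gmms

def classify(y, gmms):
    feats = mfcc_features(y)
    # GaussianMixture.score returns the average log-likelihood per frame.
    return max(gmms, key=lambda label: gmms[label].score(feats))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Noise stand-ins (2 s each); real data would be in-ear recordings of each event.
    data = {"cough": [rng.standard_normal(32000) for _ in range(3)],
            "tongue_click": [rng.standard_normal(32000) for _ in range(3)]}
    gmms = train_gmms(data)
    print(classify(rng.standard_normal(32000), gmms))
```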
Item: Design, implementation, and testing of an energy consumption management system applied in Internet protocol data networks (2021) Velez Varela, Fernando; Marin Lozano, Diego Fernando; Sarria Paja, Milton
To model the power measurement conditions of an energy system, an initial reference should be made within the specification of a generic energy consumption (EC) model. This EC model is based on the energy state of a system comprising goods, networks, and services, and it supports energy management capabilities. Therefore, a prototype tool to record power consumption (PC) at the device level is designed and implemented, thereby enabling device-use optimization. The core is to collect energy-related information in Internet protocol networks that support the Simple Network Management Protocol (SNMP), in addition to the standard proposed in RFC 7460 and green information technology. The information obtained relates to the energy consumed by devices that support those technologies; it is then processed to evaluate different PC characteristics of the network. This research addresses a knowledge gap in information system management, where the objective is to establish metrics and methodological schemes for obtaining data on measurable behaviors in the EC field.

Item: Estimating noise pollution caused by vehicular traffic in an institution of higher education in the city of Cali (Institute of Electrical and Electronics Engineers Inc., 2019-06-06) Zamorano Narváez, Larry; Torres Rentería, Miguel; Díaz, Maria Fernanda; Sarria Paja, Milton; Ochoa, Jonathan
Noise pollution can affect human health. According to research by the World Health Organization (WHO), the European Union, and the Spanish National Research Council, noise pollution can cause temporary deafness and hypertension symptoms, and it affects concentration and learning processes. In this paper, we estimate the noise levels affecting a pilot area in an institution of higher education located in Cali, Colombia. We used the noise-map simulation method, in which the cartographic information of the study area is analyzed using ArcGIS® software; the results are then fed to the SOUNDPLAN® software with the French model "NMPB-Routes-2008". The resulting sound pressure levels are compared against those established by law (Resolution 0627, issued by the Ministry of Environment, Housing and Territorial Development) in Colombia for type B sectors, the sector type in which the institution is located.

Item: Fusion of auditory inspired amplitude modulation spectrum and cepstral features for whispered and normal speech speaker verification (Academic Press, 2017-03-25) Sarria Paja, Milton; Falk, Tiago H.
Whispered speech is a natural speaking style that, despite its reduced perceptibility, still contains relevant information regarding the intended message (i.e., intelligibility), as well as the speaker identity and gender. Given the acoustic differences between whispered and normally-phonated speech, however, speech applications trained on the latter but tested with the former exhibit unacceptable performance levels. Within an automated speaker verification task, previous research has shown that (i) conventional features (e.g., mel-frequency cepstral coefficients, MFCCs) do not convey sufficient speaker discrimination cues across the two vocal efforts, and (ii) multi-condition training, while improving the performance for whispered speech, tends to deteriorate the performance for normal speech. In this paper, we aim to tackle both shortcomings by proposing three innovative features which, when fused at the score level, are shown to yield reliable results for both normal and whispered speech. Overall, relative improvements of 66% and 63% are obtained for whispered and normal speech, respectively, over a baseline system based on MFCCs and multi-condition training.
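The score-level fusion mentioned in the item above can be illustrated with a minimal sketch: z-normalize the verification scores produced by two subsystems on the same trials and combine them with a weighted sum. The weights, the decision threshold, and the random stand-in scores below are illustrative assumptions, not values from the paper.

```python
# Minimal sketch of score-level fusion of two speaker-verification subsystems
# (e.g., a cepstral-based and a modulation-spectrum-based verifier).
import numpy as np

def znorm(scores):
    return (scores - scores.mean()) / (scores.std() + 1e-12)

def fuse(scores_a, scores_b, weight_a=0.5):
    return weight_a * znorm(scores_a) + (1.0 - weight_a) * znorm(scores_b)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    cepstral_scores = rng.normal(size=10)     # stand-in scores from subsystem A
    modulation_scores = rng.normal(size=10)   # stand-in scores from subsystem B
    fused = fuse(cepstral_scores, modulation_scores, weight_a=0.6)
    decisions = fused > 0.0                   # accept/reject at an illustrative threshold
    print(fused, decisions)
```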
Item: Fusion of bottleneck, spectral and modulation spectral features for improved speaker verification of neutral and whispered speech (Elsevier B.V., 2018-07-27) Sarria Paja, Milton; Falk, Tiago H.
Speech-based biometrics is becoming a preferred method of identity management amongst users and companies. Current state-of-the-art speaker verification (SV) systems, however, are known to be strongly dependent on the condition of the speech material provided as input, and can be affected by unexpected variability during testing, such as environmental noise or changes in vocal effort. In this paper, SV using whispered speech is explored, as whispered speech is known to be a natural speaking style with reduced perceptibility that nonetheless contains relevant information regarding speaker identity and gender. We propose to fuse information from spectral, modulation spectral, and so-called bottleneck features computed via deep neural networks, at the feature and score levels. Bottleneck features have recently been shown to provide robustness against train/test mismatch conditions and had yet to be tested for whispered speech. Experimental results showed that relative improvements as high as 79% and 60% could be achieved for neutral and whispered speech, respectively, relative to a baseline system trained with i-vectors extracted from mel-frequency cepstral coefficients. Results from our fusion experiments show that the proposed strategies make efficient use of the limited resources available and yield whispered speech performance in line with that obtained with normal speech.

Item: Parkinson's disease detection using modulation components in speech signals (Institute of Electrical and Electronics Engineers Inc., 2019-06-06) Moofarry, Jhon F.; Sarria Paja, Milton; Orozco Arroyave, J.R.
Parkinson's disease (PD) is the second most prevalent neurodegenerative disorder after Alzheimer's disease. This disorder affects around 2% of the elderly population. In Colombia, the prevalence of Parkinson's disease is around 172 cases per 100,000 inhabitants. Furthermore, around 89% of people diagnosed with PD also suffer from speech disorders. This has motivated many advances in speech signal processing for PD patients, which allow assisted diagnosis and monitoring of the progression of the disease. In this paper, we propose to use slowly varying information from speech signals, also known as modulation components, and to combine it with an approach that effectively reduces the number of features used in a classification system. The proposed approach achieves around 90% accuracy, outperforming the classical mel-frequency cepstral coefficient (MFCC) approach. Results show that the information in slowly varying components is highly discriminative and can support assisted diagnosis of PD.
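To give a concrete sense of what "slowly varying information" can mean, here is a minimal sketch of one common way to expose modulation components: band-pass the signal, extract its Hilbert envelope, low-pass the envelope, and inspect the envelope spectrum. The filter band, cutoff, and synthetic test signal are illustrative assumptions; the paper's exact modulation representation and feature-reduction step are not reproduced.

```python
# Minimal sketch of a modulation-spectrum style analysis of a speech-like signal.
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def modulation_spectrum(y, sr, band=(300.0, 3400.0), env_cutoff=20.0):
    # 1) acoustic band-pass filter
    sos_band = butter(4, band, btype="bandpass", fs=sr, output="sos")
    y_band = sosfiltfilt(sos_band, y)
    # 2) temporal envelope via the analytic signal
    envelope = np.abs(hilbert(y_band))
    # 3) keep only slow envelope fluctuations (below env_cutoff Hz)
    sos_env = butter(4, env_cutoff, btype="lowpass", fs=sr, output="sos")
    envelope = sosfiltfilt(sos_env, envelope)
    # 4) spectrum of the envelope = modulation spectrum
    spectrum = np.abs(np.fft.rfft(envelope - envelope.mean()))
    freqs = np.fft.rfftfreq(envelope.size, d=1.0 / sr)
    return freqs, spectrum

if __name__ == "__main__":
    sr = 16000
    t = np.arange(sr) / sr
    # Synthetic stand-in: a 500 Hz carrier amplitude-modulated at 4 Hz.
    y = (1.0 + 0.5 * np.sin(2 * np.pi * 4 * t)) * np.sin(2 * np.pi * 500 * t)
    freqs, spectrum = modulation_spectrum(y, sr)
    print(freqs[np.argmax(spectrum[1:]) + 1])   # expect a peak near 4 Hz
```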
Item: Smartphones dependency risk analysis using machine-learning predictive models (2022-12) Giraldo Jiménez, Claudia Fernanda; Gaviria Chavarro, Javier; Sarria Paja, Milton; Bermeo Varón, Leonardo Antonio; Villarejo Mayor, John Jairo; Rodacki, André Luiz Felix
Recent technological advances have changed how people interact, run businesses, learn, and use their free time. The advantages and facilities provided by electronic devices have played a major role. On the other hand, extensive use of such technology also has adverse effects on several aspects of human life (e.g., the development of societal sedentary lifestyles and new addictions). Smartphone dependency is a new addiction that primarily affects the young population. The consequences may negatively impact mental and physical health (e.g., lack of attention or local pain). Health professionals rely on self-reported subjective information to assess the dependency level, requiring specialists' opinions to diagnose such a dependency. This study proposes a data-driven prediction model for smartphone dependency based on machine learning techniques, using an analytical retrospective case–control approach. Different classification methods were applied, including classical and modern machine learning models. Students from a private university in Cali, Colombia (n = 1228) were tested for (i) smartphone dependency, (ii) musculoskeletal symptoms, and (iii) the Risk Factors Questionnaire. Random forest, logistic regression, and support vector machine based classifiers exhibited the highest prediction accuracy, 76–77%, for smartphone dependency, estimated through the stratified k-fold cross-validation technique. Results showed that self-reported information provides insight into predicting smartphone dependency correctly. Such an approach opens doors for future research aiming to include objective measures to increase accuracy and help reduce the negative consequences of this new form of addiction.

Item: Variants of mel-frequency cepstral coefficients for improved whispered speech speaker verification in mismatched conditions (Institute of Electrical and Electronics Engineers Inc., 2017-10-26) Sarria Paja, Milton; Falk, Tiago H.
In this paper, automatic speaker verification using normal and whispered speech is explored. Typically, for speaker verification systems, varying vocal-effort inputs during the testing stage significantly degrade system performance. Solutions such as feature mapping or the addition of multi-style data during the training and enrollment stages have been proposed, but they do not show similar advantages for all the speaking styles involved. Herein, we focus on the extraction of invariant speaker-dependent information from normal and whispered speech, thus allowing for improved multi-vocal-effort speaker verification. We base our search on previously reported perceptual and acoustic insights and propose variants of the mel-frequency cepstral coefficients (MFCC). We show the complementarity of the proposed features via three fusion schemes. Gains as high as 39% and 43% can be achieved for normal and whispered speech, respectively, relative to existing systems based on conventional MFCC features.
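Referring back to the smartphone-dependency study above, the following is a minimal sketch of that style of evaluation: stratified k-fold cross-validation comparing random forest, logistic regression, and SVM classifiers. The feature matrix is synthetic (via make_classification) and only stands in for the questionnaire data; the hyperparameters are illustrative, not the study's settings.

```python
# Minimal sketch of comparing three classifiers with stratified k-fold CV.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Stand-in for self-reported questionnaire features and a dependency label.
X, y = make_classification(n_samples=1228, n_features=20, n_informative=8,
                           weights=[0.7, 0.3], random_state=0)

models = {
    "random_forest": RandomForestClassifier(n_estimators=200, random_state=0),
    "logistic_regression": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "svm": make_pipeline(StandardScaler(), SVC()),
}

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=cv, scoring="accuracy")
    print(f"{name}: {scores.mean():.3f} +/- {scores.std():.3f}")
```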