TY - GEN
T1 - double PT
T2 - 35th IEEE International Conference on Tools with Artificial Intelligence, ICTAI 2023
AU - Wang, Lu
AU - Law, Ka Lun Eddie
N1 - Publisher Copyright:
© 2023 IEEE.
PY - 2023
Y1 - 2023
N2 - With meta-learning, models are trained on multiple tasks, and the resulting trained models are expected to be capable of 'learning' new tasks effectively. MAML (Model-Agnostic Meta-Learning) was one such early design that allowed models to reuse learned features, but with limited performance. Pre-training is another known method for improving the performance of a final trained model: it helps models find better initialization points and thus offers better feature representations. In this paper, we propose doublePT (double pre-train), a two-stage pre-training method whose goals are to 1) reduce the number of parameters, 2) acquire better feature representations, and 3) achieve competitive overall performance on different benchmarks. In the first stage, we use a universal pre-trained model to capture general features from a large dataset. In the second stage, we use MAML to fine-tune the pre-trained model to enhance feature adaptability. Since the first-stage pre-trained model has already learned general feature representations, the second-stage fine-tuning requires less training and enables better feature extraction on new tasks. Our experiments show that the proposed doublePT approach fine-tunes well across different tasks and outperforms the one-stage pre-training approach. When doublePT is combined with DINOv2 and compared against the latest PMF meta-learning design, the PMF pipeline requires 304.8% more parameters than our proposed DINOv2+doublePT model. Performance-wise, DINOv2+doublePT also achieves the best accuracy across the different benchmarks.
AB - With meta-learning, models are trained on multiple tasks, and the resulting trained models are expected to be capable of 'learning' new tasks effectively. MAML (Model-Agnostic Meta-Learning) was one such early design that allowed models to reuse learned features, but with limited performance. Pre-training is another known method for improving the performance of a final trained model: it helps models find better initialization points and thus offers better feature representations. In this paper, we propose doublePT (double pre-train), a two-stage pre-training method whose goals are to 1) reduce the number of parameters, 2) acquire better feature representations, and 3) achieve competitive overall performance on different benchmarks. In the first stage, we use a universal pre-trained model to capture general features from a large dataset. In the second stage, we use MAML to fine-tune the pre-trained model to enhance feature adaptability. Since the first-stage pre-trained model has already learned general feature representations, the second-stage fine-tuning requires less training and enables better feature extraction on new tasks. Our experiments show that the proposed doublePT approach fine-tunes well across different tasks and outperforms the one-stage pre-training approach. When doublePT is combined with DINOv2 and compared against the latest PMF meta-learning design, the PMF pipeline requires 304.8% more parameters than our proposed DINOv2+doublePT model. Performance-wise, DINOv2+doublePT also achieves the best accuracy across the different benchmarks.
KW - double-pre-train
KW - few-shot learning
KW - meta-learning
KW - meta-pre-train
UR - http://www.scopus.com/inward/record.url?scp=85182406383&partnerID=8YFLogxK
U2 - 10.1109/ICTAI59109.2023.00107
DO - 10.1109/ICTAI59109.2023.00107
M3 - Conference contribution
AN - SCOPUS:85182406383
T3 - Proceedings - International Conference on Tools with Artificial Intelligence, ICTAI
SP - 688
EP - 692
BT - Proceedings - 2023 IEEE 35th International Conference on Tools with Artificial Intelligence, ICTAI 2023
PB - IEEE Computer Society
Y2 - 6 November 2023 through 8 November 2023
ER -