Liangyi Huang

Arizona State University

Address: 1001 E Playa Del Norte Dr, APT 2339, Tempe, AZ, 85288
Phone: +1 440 876 8069

Publications

  1. Liangyi Huang, Xusheng Xiao.
    CTIKG: LLM-Powered Knowledge Graph Construction from Cyber Threat Intelligence.
    The Conference on Language Modeling (COLM 2024), Philadelphia, PA, USA, Oct 2024.
  2. Feng Dong, Shaofei Li, Peng Jiang, Ding Li, Haoyu Wang, Liangyi Huang, Xusheng Xiao, Jiedong Chen, Xiapu Luo, Yao Guo, and Xiangqun Chen.
    Are We There Yet? An Industrial Viewpoint on Provenance-based Endpoint Detection and Response Tools.
    The 30th ACM Conference on Computer and Communications Security (CCS 2023), Copenhagen, Denmark, Nov 2023.
  3. Liangyi Huang, Sophia Hall, Fei Shao, Arafath Nihar, Vipin Chaudhary, Yinghui Wu, Roger French, and Xusheng Xiao.
    System-Auditing, Data Analysis and Characteristics of Cyber Attacks for Big Data Systems.
    The Conference on Information and Knowledge Management (CIKM), Demo Track, Hybrid Conference, Atlanta, Georgia, USA, 2022.
  4. William C Oltjen, Yangxin Fan, Jiqi Liu, Liangyi Huang, Xuanji Yu, Mengjie Li, Hubert Seigneur, Xusheng Xiao, Kristopher O Davis, Laura S Bruckman, Yinghui Wu, and Roger H French.
    FAIRification, Quality Assessment, and Missingness Pattern Discovery for Spatiotemporal Photovoltaic Data.
    IEEE Photovoltaics Specialists Conference (PVSC), San Juan, Puerto Rico, 2022.
  5. Yanni Zhao, YingHui Wang, Ningna Wang, Xiaojuan Ning, Zhenghao Shi, Minghua Zhao, Ke Lv, LiangYi Huang.
    A Hole Repairing Method Based on Slicing.
    In International Conference on E-Learning and Games (pp. 123-131). Springer, Cham. (2018, June 28-30).
  6. YingHui Wang, Yanni Zhao, Ningna Wang, Xiaojuan Ning, Zhenghao Shi, Minghua Zhao, Ke Lv, LiangYi Huang.
    A Hole Repairing Method Based on Edge-Preserving Projection.
    In International Conference on E-Learning and Games (pp. 115-122). Springer, Cham. (2018, June 28-30).
  7. Lijuan Wang, YingHui Wang, Ningna Wang, Xiaojuan Ning, Ke Lv, LiangYi Huang.
    A Slice-Guided Method of Indoor Scene Structure Retrieving.
    In International Conference on E-Learning and Games (pp. 192-202). Springer, Cham. (2018, June 28-30).
  8. LiangYi Huang.
    A Preliminary Analysis of A Photovoltaic Solar Chimney Hybrid-power Plant (Chinese).
    Science & Technology Economy Market, pp. 7-8. (2015, Dec).
  9. LiangYi Huang, Fei Cao.
    Experimental Research on Performance of A Photovoltaic Solar Chimney Hybrid-power Plant (Chinese).
    Heilongjiang Science, pp. 16-17. (2016, Jan).

Education

Research Projects

CYBERDOC: Enhanced Cyber Threat Summaries Powered by LLMs and CTI Knowledge Graph

Mar. 2024 - Nov. 2024

  • Motivation: Current cyber threat knowledge bases rely on manual analysis and cannot scale to cover all emerging threats and behaviors. General-purpose LLMs and text summarization techniques lack a security-domain focus, which makes them ineffective at producing comprehensive cyber threat summaries.
  • Goal: CYBERDOC aims to automatically retrieve articles describing cyber threat behaviors and empower LLMs with security domain knowledge to generate accurate and comprehensive threat summaries. It seeks to overcome the limitations of existing methods by providing scalable and detailed insights into rapidly evolving cyber threats.
  • Approach: CYBERDOC integrates security domain knowledge into LLMs to focus on extracting security-oriented information from collected CTI articles. It constructs a knowledge graph from extracted triples and uses it to generate concise threat summaries, also performing community detection to reveal relationships among correlated security entities.
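
The graph-construction step in the approach above can be illustrated with a minimal sketch. It is not the CYBERDOC implementation: the triples are hypothetical placeholders, the helper names `build_graph` and `connected_components` are invented for illustration, and connected components are used as a deliberately simple stand-in for a real community detection algorithm.

```python
from collections import defaultdict

# Hypothetical (subject, relation, object) triples; not taken from CYBERDOC.
TRIPLES = [
    ("MalwareA", "exploits", "CVE-0000-0001"),
    ("MalwareA", "connects_to", "c2.example.com"),
    ("MalwareB", "drops", "payload.bin"),
]


def build_graph(triples):
    """Build an undirected adjacency map over the entities in the triples."""
    graph = defaultdict(set)
    for subj, _relation, obj in triples:
        graph[subj].add(obj)
        graph[obj].add(subj)
    return graph


def connected_components(graph):
    """Group entities that are reachable from one another.

    Assumption: this is a simplified stand-in for community detection,
    which would normally split dense graphs more finely.
    """
    seen, groups = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            group.add(node)
            stack.extend(graph[node] - seen)
        groups.append(group)
    return groups


groups = connected_components(build_graph(TRIPLES))
for group in groups:
    print(sorted(group))
```

Each printed group is a set of correlated security entities that a summary could be organized around.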

CTIKG: LLM-Powered Knowledge Graph Construction from Cyber Threat Intelligence

Mar. 2022 - Mar. 2024

  • Motivation: With the increasing complexity of cyberattacks, current methods fail to capture the relationships between security entities in Cyber Threat Intelligence articles. There is a need for automated knowledge graph construction to better analyze these threats.
  • Goal: CTIKG aims to use large language models to construct a security-oriented knowledge graph from CTI articles, revealing relationships between entities such as malware and vulnerabilities.
  • Approach: CTIKG employs prompt engineering, text segmentation, and multiple LLM agents with a dual memory design to extract and summarize triples from CTI articles, building a comprehensive and accurate knowledge graph while addressing LLM challenges like token limitations and hallucinations.
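
As an illustration of the text segmentation step mentioned above, the following sketch packs whole sentences into segments that stay under a size budget. It is an assumption-laden example, not CTIKG's code: the function name `segment_text` is hypothetical, and word count is used as a rough proxy for LLM tokens.

```python
import re


def segment_text(text, max_words=50):
    """Greedily pack whole sentences into segments of at most max_words words.

    Assumption: word count approximates token count. A single sentence
    longer than max_words is kept whole in its own oversized segment.
    """
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    segments, current, count = [], [], 0
    for sentence in sentences:
        n = len(sentence.split())
        # Flush the current segment if this sentence would exceed the budget.
        if current and count + n > max_words:
            segments.append(" ".join(current))
            current, count = [], 0
        current.append(sentence)
        count += n
    if current:
        segments.append(" ".join(current))
    return segments


article = "A b c. D e f. G h i."
print(segment_text(article, max_words=6))
```

Splitting on sentence boundaries keeps each subject-relation-object statement intact within a segment, so a triple is less likely to be cut in half before extraction.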

Detection of Cyber Attacks on Big Data Systems

Jan. 2022 - May 2022

  • Motivation: With the rapid growth of cyber attacks, an intelligent detection system is needed to defend against data-stealing Trojans and data-encrypting ransomware. The server cluster studied in this project stores high-value files and contains a large number of data nodes with similar data flows and behavior.
  • Goal: Convert the working state of the servers into features and use AI models to distinguish normal states from malicious ones.
  • Approach: Deployed a system auditing tool on the server cluster, built a log analysis system that generates graph-based log files, studied the characteristics of the collected logs, and implemented a detection system for unexpected file deletion.
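
A minimal sketch of what a check for unexpected file deletion over audit events might look like is given below. Everything in it is hypothetical: the `Event` record, the sample events, and the allowlist rule are illustrative assumptions, not the detection logic used in the project.

```python
from collections import namedtuple

# Hypothetical audit record: which process performed which operation on which file.
Event = namedtuple("Event", ["process", "operation", "path"])


def unexpected_deletions(events, allowed_processes):
    """Return delete events issued by processes outside the allowlist.

    Assumption: "unexpected" means "deleted by a process that is not
    known to delete files during normal operation".
    """
    return [
        e for e in events
        if e.operation == "delete" and e.process not in allowed_processes
    ]


events = [
    Event("logrotate", "delete", "/var/log/app.log.1"),
    Event("unknown.bin", "delete", "/data/node1/block_0001"),
    Event("unknown.bin", "read", "/data/node1/block_0002"),
]
flagged = unexpected_deletions(events, allowed_processes={"logrotate"})
for e in flagged:
    print(e.process, e.path)
```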

FAIRification

Sep. 2021 - May 2021

  • Motivation: The rapid growth of the photovoltaic market presents challenges in managing large, inconsistent datasets from power plants. Efficient methods are needed to address data quality issues and impute missing values to improve the accuracy of power forecasting and long-term reliability assessments.
  • Goal: This research aims to develop a FAIRification framework to automate the ingestion, quality assessment, and missingness imputation of spatiotemporal PV data, enhancing the reusability and interoperability of these datasets for performance analysis.
  • Approach: The framework applies the FAIR principles (Findable, Accessible, Interoperable, Reusable) to standardize data and uses imputation methods such as K-nearest neighbors, linear interpolation, and mean interpolation to handle missing data. KNN was found to be the most effective due to its ability to leverage spatial coherence. The research also explores the potential of spatio-temporal graph neural networks to further enhance data imputation.
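
Two of the imputation methods named above can be sketched as follows. This is an illustrative example under stated assumptions rather than the framework's code: the function names, site identifiers, coordinates, and readings are hypothetical, and the KNN variant shown picks neighbors by geographic distance between sites, which is one way to exploit spatial coherence.

```python
import math


def linear_interpolate(series):
    """Fill interior None gaps on a straight line between known neighbors.

    Leading and trailing gaps are left as None because they have only
    one known neighbor.
    """
    filled = list(series)
    known = [i for i, v in enumerate(series) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (series[right] - series[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = series[left] + step * (i - left)
    return filled


def knn_impute(target, readings, coords, k=2):
    """Estimate the missing reading at `target` as the mean of the k nearest
    sites that have data (assumption: Euclidean distance on site coordinates)."""
    tx, ty = coords[target]
    candidates = sorted(
        (math.hypot(coords[s][0] - tx, coords[s][1] - ty), s)
        for s, v in readings.items()
        if s != target and v is not None
    )
    nearest = [readings[s] for _, s in candidates[:k]]
    if not nearest:
        raise ValueError("no neighboring sites with data")
    return sum(nearest) / len(nearest)


# Hypothetical sites and power readings at a single timestamp.
coords = {"a": (0, 0), "b": (1, 0), "c": (0, 2), "d": (10, 10)}
readings = {"a": None, "b": 10.0, "c": 20.0, "d": 100.0}

print(linear_interpolate([1.0, None, None, 4.0]))
print(knn_impute("a", readings, coords, k=2))
```

Linear interpolation uses only a site's own history, while the KNN estimate borrows from nearby sites at the same timestamp, which is why it can still fill long outages at a single plant.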