Chunqiu Zeng's Homepage

Chunqiu Zeng
Ph.D. Student, Knowledge Discovery Research Group (KDRG)
School of Computing and Information Science
Florida International University

Contact information
Address: ECS 251, Florida International University, 11200 SW 8th St., Miami, FL 33199, USA
Phone: (305) 348-6036
Email: grandzeng(at)gmail(dot)com

General information

Chunqiu is now a software engineer working on content ads at Google. He received his Ph.D. degree from the School of Computing and Information Science at Florida International University in November 2016. His dissertation is about Large Scale Data Mining for IT Service Management. During his Ph.D. study, he was co-advised by Prof. Tao Li and Prof. Shu-Ching Chen.

From 2009 to 2011, he was a data warehouse software engineer at Alibaba Company, where he worked on R&D for the infrastructure tools and distributed computing platform of the data warehouse.

He received his master's degree from Sichuan University, China, in 2009, where his advisor was Prof. Changjie Tang.

More details can be found in his CV and Google Scholar profile.

Research Interests

Chunqiu is interested in large-scale data mining, system management, machine learning, and databases. He is also strongly interested in large-scale distributed computing platforms, including distributed databases and cloud computing.


Education

Ph.D. Student at Florida International University, majoring in Computer Science (December 2011 - Present).

Master's Student at Sichuan University, majored in Computer Science (September 2006 - July 2009).

Bachelor's Student at Sichuan University, majored in Computer Science (September 2002 - July 2006).

Research Experience

  • Knowledge Discovery Research Group (KDRG), School of Computing and Information Science, FIU, December 2011 - Present

    • BCIN. This project aims to integrate disaster information by building a vertical search engine. In charge of implementing Disaster Real Time Monitoring.

    • FIU-Miner (a Fast, Integrated, User-friendly System for Data Mining in Distributed Environment). This project aims at building a large-scale data analysis system with the following functionalities: (1) integrate complex data mining algorithms and exchange intermediate results among subtasks (rather than the simple data processing done in Hadoop); (2) monitor system resource consumption in real time based on trace data; and (3) balance the workload of nodes in a cluster based on the monitored utilization data.

    • System Resource Monitoring and Mining. This project aims at building an integrated framework to monitor the resource utilization of nodes in a cluster. Besides monitoring, the framework supports real-time queries over the continuously monitored log streams and provides mining modules to discover useful patterns for problem determination.

  • Database and Knowledge Engineering Institute at Sichuan University, China, Research Assistant, September 2006 - July 2009

    • National Birth Defects Supervision Project, supported by the 11th Five-Year Key Programs for Science & Technology Development of China under grant No. 2006BAI05A01. In charge of the OLAP module, which supports roll-up, drill-down, data slicing, and data visualization for OLAP operations. (2007 - July 2009)

    • Crime Miner: A criminal data mining system based on artificial intelligence, supported by the Foundation of Innovation Software Engineering for Young People in Sichuan. In charge of the architecture design and the implementation of the system framework, the data access layer, and the classification module for criminal data. (October 2007 - September 2008)

    • National Earthquake Detection and Reporting System (China), directed by the Seismology Bureau of Sichuan Province in 2007. In charge of the graphical user interface module for seismic waveforms, the instant event inspection module, the query cache module for waveform data from the database, and the parameter configuration module. (January 2007 - April 2007)

Industrial Experience

  • Google, Intern at DoubleClick Search Ads Reporting Team (May 2015 - July 2015)

    Leveraged data mining and statistical techniques to conduct change impact analysis, in order to identify correlations between advertising changes and performance metrics. Constructed a Flume pipeline for change impact analysis over large-scale data.

  • Google, Intern at YouTube AdFormat Team (May 2014 - August 2014)

    Built a tool to integrate multi-source ads time series data for YouTube videos. Developed an integrated interface that lets users query, analyze, and visualize ads time series data. Explored the correlation between time series to identify the QoE factors that contribute to the quality of ads.

  • IBM T.J. Watson Research Center, Research Intern at Service Management Team (May 2013 - August 2013)

    Proposed and implemented a hierarchical multi-label classification component to improve the determination of problem types based on the incident tickets generated by the IBM Tivoli Monitoring System.

  • Alibaba Company, Senior Data Engineer (July 2009 - December 2011)

    • Job Scheduling System schedules more than 10,000 ETL jobs each day across different system resources such as RAC, Greenplum, and Hadoop/Hive, with high resource utilization and job throughput. Main contribution: split the resources into groups; designed and implemented resource scheduling algorithms that balance load among resource groups and preempt resources across groups so that all available resources are used fully and efficiently. (SQL, Shell, Python)

    • ADS (Automatic Deploying System) reduces the manual effort of deploying source code by automating deployment, and can deploy hundreds of projects each day. Main contribution: architecture design and implementation, including the main module, Crawler module, Putter module, Executor module, and DB connection pool module. (Python, SQL)

    • DHW (Data High Way) transmits data at high speed among heterogeneous data sources such as DB2, RAC, Greenplum, HDFS, MySQL, and SQL Server. Main contribution: architecture design; implementation of the distributed framework for DHW based on Thrift; design and implementation of the load-balancing algorithm for transmitting data among the nodes in the cluster; and the I/O module for HDFS. (Python, Java)

    • Hotspot discovers frequently visited hot tables from the ETL job log and analyzes them to optimize the data model of the data warehouse. Main contribution: extracted tables from the SQL log using SQL parsing; used an association rule algorithm to mine latent relationships among a large number of tables. (C, Python)

    • Alihive, a library for accessing the Hive server and logging each SQL statement. Main contribution: designed and implemented the API for submitting jobs to the Hive server. (Perl)

    • Metadata Mining System applies data mining algorithms to discover underlying relationships among a large number of tables in the data warehouse, in order to optimize the data model. In charge of extracting metadata from job logs using SQL parsing for Oracle, Greenplum, BIEE, MSTR, etc., and developing an association rule algorithm to mine the metadata.

  • Alcatel-Lucent Company, Software Engineer Intern in the MSGUI group (August 2007 - March 2008)

    • MAPIM manages network elements such as nodes and links. Main contribution: following the business logic of network management, completed the data design (conceptual, logical, and physical data models); designed and implemented the DBI (database access interface); and constructed simulated data for the database by developing a CORBA application using the Name Service and Notification Service. (C++)

    • GUI Server provides graphics rendering services to users. Main contribution: customized the graphics rendering service in GUI Server via XML; developed the module that responds to request commands from the GUI Render Client; implemented the CORBA communication module between GUI Server and MAPIM using JacORB. (Java)


Book Chapter
  1. Event Mining: Algorithms and Applications. Released by Chapman and Hall/CRC, September 2015. Chunqiu Zeng contributed to Chapter 4 (Pattern Discovery and Summarization Event Pattern Mining) and Chapter 5 (Mining Time Lags).
  2. Maximizing Management Performance and Quality with Service Analytics. Released by IGI Global, August 2015. Chunqiu Zeng contributed to Chapter 7 (Tuning up IT Services using Monitoring Configuration Analytics).
  3. Data Mining Where Theory Meets Practice. Released by Xiamen University Press, 2013, ISBN 9787561542941. Chunqiu Zeng contributed to Chapter 13 (Data Mining Application in Advanced Manufacturing).

Publications

  1. Tao Li, Yexi Jiang, Chunqiu Zeng, Bin Xia, Wubai Zhou, Wentao Wang, Liang Zhang, Li Xue, Dewei Bao, "FLAP: An End-to-End Event Log Analysis Platform for System Management", in Proceedings of the 23rd annual ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), 2017.

  2. Wubai Zhou, Wei Xue, Ramesh Baral, Qing Wang, Chunqiu Zeng, Tao Li, Jian Xu, Zheng Liu, Larisa Shwartz, Genady Ya. Grabarnik, "STAR: A System for Ticket Analysis and Resolution", in Proceedings of the 23rd annual ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), 2017.

  3. Qing Wang, Wubai Zhou, Chunqiu Zeng, Tao Li, Larisa Shwartz, Genady Ya. Grabarnik, "Constructing the Knowledge Base for Cognitive IT Service Management", in Proceedings of the 14th IEEE International Conference on Services Computing (SCC), 2017. (Best Student Paper)

  4. Tao Li, Chunqiu Zeng, Yexi Jiang, Wubai Zhou, Liang Tang, Zheng Liu, Yue Huang, "Data-Driven Techniques for Computing System Management", in ACM Computing Surveys, 2017.

  5. Chunqiu Zeng, Wubai Zhou, Tao Li, Larisa Shwartz, Genady Y. Grabarnik, "Knowledge Guided Hierarchical Multi-Label Classification over Ticket Data", in IEEE Transactions on Network and Service Management (TNSM), 2017. (In Press)

  6. Qifeng Zhou, Tao Li, Wei Xue, Chunqiu Zeng, Bin Xia, Ruyuan Han, Linkai Luo, "An Advanced Inventory Data Mining System for Business Intelligence", in Proceedings of IEEE Bigdata Service, 2017. (Accepted)

  7. Tao Li, Chunqiu Zeng, Wubai Zhou, Wei Xue, Yue Huang, Zheng Liu, Qifeng Zhou, Bin Xia, Qing Wang, Wentao Wang, Xiaolong Zhu, "FIU-Miner (A Fast, Integrated, and User-Friendly System for Data Mining) and Its Applications", in Knowledge and Information Systems (KAIS), 2016. (In Press)

  8. Tao Li, Ning Xie, Chunqiu Zeng, Wubai Zhou, Li Zheng, Yexi Jiang, Yimin Yang, Hsin-Yu Ha, Wei Xue, Yue Huang, Shu-Ching Chen, Jai Navlakha, and S.S. Iyengar, "Data-driven Techniques in Disaster Information Management", in ACM Computing Surveys, 2016. (In Press)

  9. Chunqiu Zeng, Qing Wang, Wentao Wang, Tao Li, Larisa Shwartz, "Online Inference for Time-Varying Temporal Dependency Discovery from Time Series", in Proceedings of the 4th annual IEEE International Conference on Big Data (IEEE Big Data), 2016. Regular Research Paper (acceptance rate: 18.68%) (pdf)

  10. Chunqiu Zeng, Liang Tang, Wubai Zhou, Tao Li, Larisa Shwartz, Genady Ya. Grabarnik, "An Integrated Framework for Mining Temporal Logs from Fluctuating Events", in IEEE Transactions on Services Computing (TSC), 2016. (In Press)

  11. Tao Li, Wubai Zhou, Chunqiu Zeng, Qing Wang, Qifeng Zhou, Dingding Wang, Jia Xu, Yue Huang, Wentao Wang, Minjing Zhang, Steve Luis, Shu-Ching Chen, Naphtali Rishe, "DI-DAP: An Efficient Disaster Information Delivery and Analysis Platform in Disaster Management", in Proceedings of the 25th ACM Conference on Information and Knowledge Management (CIKM 2016), Indianapolis, US, Oct. 2016. Full paper at industrial track (acceptance rate: 22/111 = 19.8%) (To Appear)

  12. Wubai Zhou, Liang Tang, Chunqiu Zeng, Tao Li, Larisa Shwartz, Genady Ya. Grabarnik, "Resolution Recommendation for Event Tickets in Service Management", in IEEE Transactions on Network and Service Management (TNSM), 2016. (In Press)

  13. Chunqiu Zeng, Qing Wang, Shekoofeh Mokhtari, Tao Li, "Online Context-Aware Recommendation with Time Varying Multi-Armed Bandit", in Proceedings of the 22nd annual ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), 2016. Research Track (acceptance rate: 142/784 = 18%) (pdf, video).

  14. Jian Xu, Liang Tang, Chunqiu Zeng, Tao Li, "Pattern Discovery via Constraint Programming", Knowledge-Based Systems, vol. 99, pages 23-32, 2016. (pdf)

  15. Jian Xu, Yexi Jiang, Chunqiu Zeng, Tao Li, "Node Anomaly Detection for Homogeneous Distributed Environments", Expert Systems with Applications (ESWA), 2015. (pdf)

  16. Liang Tang, Yexi Jiang, Lei Li, Chunqiu Zeng, Tao Li, "Personalized Recommendation via Parameter-Free Contextual Bandits", in Proceedings of the 38th Annual International ACM SIGIR Conference (SIGIR 2015) (oral presentation, acceptance rate: 70/351 = 20.0%). (pdf)

  17. Chunqiu Zeng, Liang Tang, Tao Li, Larisa Shwartz, Genady Ya. Grabarnik, "Mining Temporal Lag from Fluctuating Events for Correlation and Root Cause Analysis", The 10th International Conference on Network and Service Management (CNSM 2014) (acceptance rate: 18/105 = 17.1%). (pdf)

  18. Chunqiu Zeng, Hongtai Li, Huibo Wang, Yudong Guang, Chang Liu, Tao Li, Mingjin Zhang, Shu-Ching Chen, Naphtali Rishe, "Optimizing Online Spatial Data Analysis with Sequential Query Patterns", The 15th IEEE International Conference on Information Reuse and Integration (IRI 2014). (pdf)

  19. Lei Li, Chao Shen, Long Wang, Li Zheng, Yexi Jiang, Liang Tang, Hongtai Li, Longhui Zhang, Chunqiu Zeng, Tao Li. "iMiner: Mining Inventory Data for Intelligent Management.", in Proceedings of the 23rd ACM Conference on Information and Knowledge Management (CIKM 2014), Shanghai, China, Nov.2014 (Demo Paper) (To appear)

  20. Yexi Jiang, Chunqiu Zeng, Jian Xu, Tao Li, "Real Time Contextual Collective Anomaly Detection over Multiple Data Streams", ACM SIGKDD Workshop on Outlier Detection & Description under Data Diversity (SIGKDD Workshop ODD^2), New York, USA, Aug. 2014.

  21. Li Zheng, Chunqiu Zeng, Lei Li, Yexi Jiang, Wei Xue, Jingxuan Li, Chao Shen, Wubai Zhou, Hongtai Li, Liang Tang, Tao Li, Bing Duan, Ming Lei, and Pengnian Wang, "Applying Data Mining Techniques to Address Critical Process Optimization Needs in Advanced Manufacturing", in Proceedings of the 20th ACM Conference on Knowledge Discovery and Data Mining (KDD 2014), New York, USA, Aug. 2014. (pdf)

  22. Chunqiu Zeng, Tao Li, Larisa Shwartz, Genady Ya. Grabarnik, "Hierarchical Multi-Label Classification over Ticket Data using Contextual Loss", in Proceedings of IEEE/IFIP Network Operations and Management Symposium (NOMS 2014), 2014. (pdf)

  23. Li Zheng, Chao Shen, Liang Tang, Chunqiu Zeng, Tao Li, Steve Luis, Shu-Ching Chen, "Data Mining Meets the Needs of Disaster Information Management", in IEEE Transactions on Human-Machine Systems, 2013. (In Press)

  24. Chunqiu Zeng, Yexi Jiang, Li Zheng, Jingxuan Li, Lei Li, Hongtai Li, Chao Shen, Wubai Zhou, Tao Li, Bing Duan, Ming Lei, Pengnian Wang, "FIU-Miner: A Fast, Integrated, User-Friendly System for Data Mining in Distributed Environment", ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2013. (pdf) (online demo)

  25. Li Zheng, Chao Shen, Liang Tang, Chunqiu Zeng, Tao Li, Steve Luis, Shu-Ching Chen and Jainendra K. Navlakha, "Disaster SitRep - A Vertical Search Engine and Information Analysis Tool In Disaster Management Domain", The 13th IEEE International Conference on Information Reuse and Integration (IRI 2012). (pdf)

  26. Chunqiu Zeng, Jie Zuo, Chuan Li, Kaikuo Xu, Shengqiao Ni, Liang Tang, Yue Zhang, Shaojie Qiao, "MPSQAR: Mining Quantitative Association Rules Preserving Semantics", in Proceedings of the International Conference on Advanced Data Mining and Applications (ADMA), pp. 572 - 580, 2008 (pdf) (EI access number: 20093412264445)

  27. LI Chuan, TANG Changjie, ZENG Chunqiu, WU Jiang, CHEN Yu, QIU Jiangtao, ZHU Jun, DAI Li, JIANG Yongguang, "Discovering Multi-dimensional Major Medicines from Traditional Chinese Medicine Prescriptions", in Proceedings of 2008 International Conference on BioMedical Engineering and Informatics (BMEI), pp. 260 - 264, 2008 (pdf)

  28. Liang Tang, Changjie Tang, Lei Duan, Chuan Li, Yexi Jiang, Chunqiu Zeng, and Jun Zhu. "MovStream: An Efficient Algorithm for Monitoring Clusters Evolving in Data Streams", in Proceedings of the 2008 IEEE International Conference on Granular Computing (GrC), pp. 582 - 587, 2008 (pdf)

  29. Liang Tang, Changjie Tang, Yexi Jiang, Chuan Li, Lei Duan, Chunqiu Zeng, Kaikuo Xu. "TROADGrid: An Efficient Trajectory Outlier Detection Algorithm with Grid-based Space Division", in Proceedings of the National Database Conference (NDBC) (China), pp. 185 - 190, 2008 (pdf)

  30. Ni Shengqiao, Tang Changjie, Wang Yongwei, Li Chuan, Zhang Yue, Zeng Chunqiu, Tang Liang. "Boost Gene Expression Programming based on GPU", in Proceedings of the National Database Conference (NDBC) (China), pp. 227 - 233, 2008

  31. ZHANG Yue, TANG Changjie, LI Chuan, ZHU Jun, ZENG Chunqiu, TANG Liang, LIU Xianbin. "Mining Naive Intervention Rules in Birth Defect Data", Journal of Frontiers of Computer Science and Technology (China), Vol. 1, No. 2, 2009.02, pp. 188 - 197, 2009 (pdf)

Personal Projects and Source Libraries

  1. OpenMiner, an open source project created by Liang Tang and me. This project provides a framework for integrating data mining algorithms and their corresponding graphical user interfaces in a pluggable way. Written in Java.

  2. Doodle Search, a project that provides a web user interface for doodling and for searching for similar doodles using a content-based image search engine. This project was completed by my classmates and me. Written in C++ and C#.

  3. Text Replacer, a library that matches and replaces text in large text files (above 10 gigabytes) roughly three times faster than the Linux "sed" command. It uses a buffering mechanism to reduce the number of disk reads and writes. Written in C.

  4. Clustering algorithms for uncertain data, a library for clustering uncertain data. My graduate thesis (The Research on the Key Techniques of Clustering over Uncertain Data, written in Chinese) is based on this library, which extends the K-Means and K-Median algorithms from certain data to uncertain data and implements the algorithms from the thesis, such as UA-UK-Means, UA-UK-Median, A-UK-Means, A-UK-Median, PA-UK-Median, and MA-UK-Means (final thesis and ppt). Written in C++.

  5. Random Data Generator, a library for generating random data. It provides algorithms to generate data from a variety of distributions, such as the uniform and Gaussian distributions. Simulated data for experiments can be easily generated with this library. Written in C.

Selected Honors

Contest Awards:

  1. 12/2006, The 2nd National Open Source Software Competition (China), award finalist.

  2. 06/2006, Amway College Students' Computer Works Competition, Second Class Award.

  3. 2004, Second Prize in the Sichuan Contest District of the National Undergraduate Mathematical Contest in Modeling.

Industrial Awards:

  1. 07/2005, Excellent Intern at Hwadee Company, as leader of the data design team.

Scholarships & Honors:

  1. 1/2009, Excellent Graduated Graduate Student of Sichuan University in 2009.

  2. 12/2008, Excellent Graduate Student Award of Sichuan University in 2007-2008.

  3. 6/2006, Excellent Final Thesis of Bachelor Degree of Sichuan University.

  4. 11/2005, Admitted to graduate study at Sichuan University with exemption from the entrance examination.

  5. 2002-2003, 2003-2004, Excellent Undergraduate Student of Sichuan University.

  6. 2002, 2003, 2004, 2005, Undergraduate Student Scholarship of Sichuan University.

  7. 2002-2003, Awarded Scholarship of WESTERN CHINA METROPOLIS DAILY.