Instructor: | A. Erdem Sariyuce (erdem AT buffalo.edu) |
Class hours: | MW 3:30-4:50, Park 146 |
Office hours: | TBA, Davis 323 |
Graphs are a universal language to deal with the complex data in science, nature, and technology. Scale and complexity of real-world networks (graphs) pose unprecedented challenges in the area of computer science. This course introduces several important problems and algorithms being used in many applications where the input data is a graph. Students will learn the challenges and problems in large-scale algorithmic graph mining, work on managing the graph data by handling the irregular computations that do not fit to traditional CPU architectures, be familiar with the recent advances in the area, and get hands-on experience with the graph mining and management research via semester-long project.
This course satisfies Master's Project requirement.
This course falls under Artificial Intelligence (AI) focus area and can be counted towards Master's or PhD's course requirements.
CSE 531: Analysis of Algorithms or the permission of the instructor. Assignments have programming components, hence a fundamental ability to code in C, C++, Java, or Python is expected. Depending on the subject, course project might also require some programming.
No textbooks required
Homeworks: | 40% : 4 * 10% (Four homeworks) |
Midterm: | 20% (In class) |
Project: | 40% : 5% (proposal) + 15% (progress)+ 20% (final) |
Questions will be about course material and occasionally require data analysis and algorithm design. Non-trivial programming skills are required in some language. Homeworks will be done individually. All assignment are due in a week after the assignment date. All homeworks should be typed and submitted by the beginning of the class on the due date.
Teams of 2 might be allowed, depending on the class size. The content and scope of the project will be decided in consultation with the instructor -- project ideas will be provided. Each team will meet and update the instructor every other week. There will be three milestones that includes reports and presentations (equal weighted); proposal, progress, and final.
- Proposal: Report will be a page long description of the project. It will be introduced with a short (5 mins) in-class presentation.General introduction to real-world networks, interdisciplinary network science field and why computer science matters in that context, review on fundamental concepts in graph theory, overview of linear algebra and matrix operations.
Link analysis in networks, hubs and authorities, HITS algorithm, degree-driven metrics to determine important nodes and edges, use of PageRank in web and beyond. (Project Proposal Reports due)
Fundamental and practical algorithms for graph traversal, breadth-first search, depth-first search, strongly connected components, direction-optimizing BFS, maximum flows & minimum cuts. (HW-1 out)
Single-source shortest paths, all-pairs shortest paths, practical algorithms for Katz, eigenvector, closeness, and betweenness centrality computations, adaptations for weighted graphs.
Graph clustering problem, community definition and detection algorithms, evaluation metrics, modularity, graph conductance, types of communities in real-world networks, overlapping communities.
Densest subgraph problem, dense subgraph models and measures, clique and quasi-cliques, connections to graph clustering and community detection, use of higher-order structures, core, truss, and nucleus decompositions. (HW-2 out)
Definition of graph partitioning, applications in scientific computing and data mining, sparse matrix vector multiplication, Kernighan-Lin algorithm, load balancing, multi-level methods, streaming graph partitioning.
Subgraph patterns, mesoscale structures, triangles and higher-order structures, motif distributions per node/edge, adaptation for directed networks, motifs on bipartite graphs and limitations, connections to subgraph isomorphism. (Project Progress Reports due)
Directed networks and challenges, graphs with categorical and numerical node/edge labels, bipartite networks and challenges, $k$-partite networks and applications.
Temporal walks, paths, and reachability, basic graph metrics in temporal graphs such as subgraphs, connectivity, and centrality, models of temporal networks, temporal network motifs. (HW-3 out)
Streaming models for graph algorithms, graph sketches, incremental methods to maintain graph analytics such as centrality, community detection, pagerank, and $k$-core computation.
Representation learning on graphs, applications in downstream ML tasks, embedding nodes, embedding subgraphs, bipartite graph embeddings, graph neural networks.
Review of parallel computing basics, shared-memory parallel programming in OpenMP and Cilk, shared-memory graph processing frameworks, GPU algorithms and frameworks for graph processing. (HW-4 out)
Distributed-memory graph processing frameworks, vertex-programming model, specialized distributed graph algorithms (graph coloring, centrality, $k$-core computation), connections to the graph partitioning. (Project Final Reports due)