# CSE 640: Graph Mining and Management (Spring 2022)

 Instructor: A. Erdem Sariyuce (erdem AT buffalo.edu) Class hours: Mon, Wed 1:30-2:50, Nsc 216 Office hours: Wed 3:30-5:30, Online over Zoom (link)

## Course Description

Graphs are a universal language to deal with the complex data in science, nature, and technology. Scale and complexity of real-world networks (graphs) pose unprecedented challenges in the area of computer science. This course introduces several important problems and algorithms being used in many applications where the input data is a graph. Students will learn the challenges and problems in large-scale algorithmic graph mining, work on managing the graph data by handling the irregular computations that do not fit to traditional CPU architectures, be familiar with the recent advances in the area, and get hands-on experience with the graph mining and management research via semester-long project.

This course satisfies Master's Project requirement.

This course falls under Artificial Intelligence (AI) focus area and can be counted towards Master's or PhD's course requirements.

## Prerequisites

CSE 531: Analysis of Algorithms or the permission of the instructor. Assignments have programming components, hence a fundamental ability to code in C, C++, Java, or Python is expected. Depending on the subject, course project might also require some programming.

## Course Materials

No textbooks required. However, we will benefit from the following two books:

Networks, Mark Newman, Oxford University Press, 2018
Networks, Crowds, and Markets, Easley and Kleinberg, Cambridge University Press, 2010 (book)

 Homeworks: 30% : 3 * 10% (Three homeworks) Midterm: 20% (In class, date TBD) Project: 50% : 10% (meetings) + 10% (proposal report) + 10% (progress report) + 20% (final report)

## Homeworks

Questions will be about course material and occasionally require data analysis and algorithm design. Non-trivial programming skills are required in some language. Homeworks will be done individually. All assignment are due in a week after the assignment date. All homeworks should be typed and submitted as pdf by the beginning of the class on the due date.

All homeworks will be submitted to AutoLab.

Follow these steps to setup an account on Autolab (unless you already have one in which case you'll use your existing account):
2. By default, AutoLab will use your official UB first and last name. If you have a different preferred name, please let me know ASAP.
3. After you have done the above steps, you wait. I will upload a list of UB emails of students registered in the course (students cannot register themselves in a course). After that, you can just login into AutoLab using MyUB and you should see the CSE 640 course.

## Project

Teams of 2 might be allowed, depending on the class size. The content and scope of the project will be decided in consultation with the instructor -- project ideas will be provided. Each team will meet and update the instructor every other week: this will affect the project grade (see above: meetings in grading policy for project). There will be three milestones that includes reports and presentations; proposal, progress, and final.

- Proposal: Report will be a page long description of the project. It will be introduced with a short (10 mins) in-class presentation.
- Progress: What has been done so far should be explained in a report (up to 9 pages) and will also be presented in-class (15 mins).
- Final: The whole project and the results will be reported at the end (up to 9 pages) and will also be presented in-class (20 mins).

All project reports will be submitted to AutoLab.

## Piazza

All course related communications will occur at the piazza.

## Health and Safety Guidelines

All students must adhere to UB's Health and Safety Guidelines. In particular;

## Schedule

 Date Topic Notes Wed, Feb 2 Introduction. Course logistics, general introduction to real-world networks, interdisciplinary network science field and why computer science matters in that context. Mon, Feb 7 Review of Basics. Review on fundamental concepts in graph theory, overview of linear algebra and matrix operations. Lecture Reading: Newman: C1, C6 Easley & Kleinberg: C1, C2 Wed, Feb 9 PageRank I. Link analysis in networks, hubs and authorities, HITS algorithm, degree-driven metrics to determine important nodes and edges, use of PageRank in web and beyond. Lecture Reading: N: C7 EK: C13, C14 Mon, Feb 14 PageRank II. Link analysis in networks, hubs and authorities, HITS algorithm, degree-driven metrics to determine important nodes and edges, use of PageRank in web and beyond. [Math behind Google by D. Gleich] [History of PageRank by D. Gleich] Lecture Reading: N: C7 EK: C13, C14 Wed, Feb 16 Proposal Presentations Proposal reports due at 1:30pm. Mon, Feb 21 Graph Traversal I. Fundamental and practical algorithms for graph traversal, breadth-first search, depth-first search, strongly connected components, direction-optimizing BFS. Lecture Reading: N: C8 Wed, Feb 23 Graph Traversal II. Fundamental and practical algorithms for graph traversal, breadth-first search, depth-first search, strongly connected components, direction-optimizing BFS. Lecture HW 1 out. Mon, Feb 28 Shortest Paths and Centrality I. Single-source shortest paths, all-pairs shortest paths, practical algorithms for Katz, eigenvector, closeness, and betweenness centrality computations, adaptations for weighted graphs. Lecture Reading: N: C10 Wed, Mar 2 Shortest Paths and Centrality II. Single-source shortest paths, all-pairs shortest paths, practical algorithms for Katz, eigenvector, closeness, and betweenness centrality computations, adaptations for weighted graphs. [Incremental Closeness Centrality] Lecture HW 1 due at 1:30pm. Mon, Mar 7 Community Detection I. Graph clustering problem, spectral clustering, community definition and detection algorithms, evaluation metrics, modularity, graph conductance, types of communities in real-world networks, overlapping communities. [Hierarchical Link Clustering] Lecture Wed, Mar 9 Community Detection II. Graph clustering problem, spectral clustering, community definition and detection algorithms, evaluation metrics, modularity, graph conductance, types of communities in real-world networks, overlapping communities. [Empirical Comparison] [Motif clustering] Lecture Mon, Mar 14 Dense Subgraph Discovery I. Densest subgraph problem, dense subgraph models and measures, clique and quasi-cliques, connections to graph clustering and community detection, use of higher-order structures, core, truss, and nucleus decompositions. Lecture Wed, Mar 16 Dense Subgraph Discovery II. Densest subgraph problem, dense subgraph models and measures, clique and quasi-cliques, connections to graph clustering and community detection, use of higher-order structures, core, truss, and nucleus decompositions. Lecture Mon, Mar 28 MIDTERM EXAM. Wed, Mar 30 Progress Presentations. Progress reports due at 1:30pm. Mon, Apr 4 Progress Presentations. Wed, Apr 6 Graph Partitioning I. Definition of graph partitioning, applications in scientific computing and data mining, sparse matrix vector multiplication, Kernighan-Lin algorithm, load balancing, multi-level methods, streaming graph partitioning. Lecture Mon, Apr 11 Graph Partitioning II. Definition of graph partitioning, applications in scientific computing and data mining, sparse matrix vector multiplication, Kernighan-Lin algorithm, load balancing, multi-level methods, streaming graph partitioning. Lecture HW 2 out. Wed, Apr 13 Network Motifs. Subgraph patterns, mesoscale structures, triangles and higher-order structures, motif distributions per node/edge, adaptation for directed networks, motifs on bipartite graphs and limitations, connections to subgraph isomorphism. Lecture Mon, Apr 18 Temporal Graphs I. Temporal walks, paths, and reachability, basic graph metrics in temporal graphs such as subgraphs, connectivity, and centrality, models of temporal networks, temporal network motifs. Lecture HW 2 due at 1:30pm. Wed, Apr 20 Temporal Graphs II. Temporal walks, paths, and reachability, basic graph metrics in temporal graphs such as subgraphs, connectivity, and centrality, models of temporal networks, temporal network motifs. Lecture Fall'20 Lecture Mon, Apr 25 Streaming Graphs. Streaming models for graph algorithms, graph sketches, incremental methods to maintain graph analytics such as centrality, community detection, pagerank, and $k$-core computation. Lecture Fall'20 Lecture (starts at 22:37) HW 3 out. Wed, Apr 27 Parallel graph algorithms. Distributed-memory and shared-memory algorithms for graph problems, graph coloring. Lecture Mon, May 2 Machine (and Deep) Learning on Graphs I. Representation learning on graphs, applications in downstream ML tasks, embedding nodes, embedding subgraphs, bipartite graph embeddings, graph neural networks. Tutorial Lecture HW 3 due at 1:30pm. Wed, May 4 Machine (and Deep) Learning on Graphs II. Representation learning on graphs, applications in downstream ML tasks, embedding nodes, embedding subgraphs, bipartite graph embeddings, graph neural networks. Tutorial Lecture Mon, May 9 Final Presentations. Final reports due at 1:30pm. Wed, May 11 Final Presentations.

## Accessibility Resources

If you have a diagnosed disability (physical, learning, or psychological) that will make it difficult for you to carry out the course work as outlined, or that requires accommodations such as recruiting note-takers, readers, or extended time on exams or assignments, you must consult with Accessibility Resources (: 60 Capen Hall, : 716-645-2608, TTY: 716-645-2616, : 716-645-3116).

You must advise your instructor during the first two weeks of the course so that we may review possible arrangements for reasonable accommodations.

## Critical Campus Resources

### Sexual Violence

UB is committed to providing a safe learning environment free of all forms of discrimination and sexual harassment, including sexual assault, domestic and dating violence and stalking. If you have experienced gender-based violence (intimate partner violence, attempted or completed sexual assault, harassment, coercion, stalking, etc.), UB has resources to help. This includes academic accommodations, health and counseling services, housing accommodations, helping with legal protective orders, and assistance with reporting the incident to police or other UB officials if you so choose. Please contact UB’s Title IX Coordinator at 716-645-2266 for more information. For confidential assistance, you may also contact a Crisis Services Campus Advocate at 716-796-4399.

### Mental Health

As a student you may experience a range of issues that can cause barriers to learning or reduce your ability to participate in daily activities. These might include strained relationships, anxiety, high levels of stress, alcohol/drug problems, feeling down, health concerns, or unwanted sexual experiences. Counseling, Health Services, and Health Promotion are here to help with these or other issues you may experience. You can learn more about these programs and services by contacting:

### Counseling Services

• 120 Richmond Quad (North Campus), 716-645-2720
• 202 Michael Hall (South Campus), 716-829-5800

### Health Services

Michael Hall (South Campus), 716-829-3316

### Health Promotion

114 Student Union (North Campus), 716-645-2837

## Preferred Name

If you would like to be addressed by a name that is different from the one in UB records, please let us know and we will use your preferred name in our communications with you. Further, you will be able to use your preferred name in all of your exams and quizzes (the homeworks will be submitted online so this issue should not come up there).

## Diversity

The UB School of Engineering and Applied Sciences considers the diversity of its students, faculty, and staff to be a strength, critical to our success. We are committed to providing a safe space and a culture of mutual respect and inclusiveness for all. We believe a community of faculty, students, and staff who bring diverse life experiences and perspectives leads to a superior working environment, and we welcome differences in race, ethnicity, gender, age, religion, language, intellectual and physical ability, sexual orientation, gender identity, socioeconomic status, and veteran status.