Hardware for AI and big data

Course info:

Semester: 5

General Foundation

ECTS: 6

Hours per week: 2

Professor: T.B.D.

Teaching style: Face-to-face

Grading: Written exams (40%), Essays / Projects (60%)

Activity            Workload (hours)
Lectures            26
Labs                13
Project             65
Independent study   46
Course total        150

Learning Results

The recent resurgence of AI has been driven by synergistic advances in large data sets, machine learning algorithms, and hardware. This course is designed to bring students up to speed on hardware for machine learning, covering the basics of deep learning, deep learning frameworks, hardware accelerators, co-optimization of algorithms and hardware, training and inference, and support for state-of-the-art deep neural networks. In particular, the course is structured around building hardware prototypes for machine learning systems on state-of-the-art platforms (e.g., FPGAs and ASICs).

Upon successful completion of this course the student will be able to:

understand applications of DNNs, CNNs and RNNs in Artificial Intelligence tasks

execute energy-efficient training and inference of AI workloads on hardware accelerators and GPUs

develop applications with AI software tools (PyTorch, TensorFlow) and hardware (Nvidia GPUs)

propose and justify an implementation architecture for AI applications in resource-constrained embedded systems

analyze a complex big data computing problem, apply principles of distributed computing, and identify solutions
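The outcome on energy-efficient inference can be made concrete with a small sketch. The snippet below illustrates symmetric per-tensor post-training quantization of weights to int8, one of the standard ingredients of efficient inference on edge accelerators; it uses plain NumPy rather than any particular framework, and the function names are illustrative, not part of any course material.

```python
import numpy as np

def quantize_int8(w):
    """Symmetric per-tensor quantization of float weights to int8."""
    scale = np.max(np.abs(w)) / 127.0              # map [-max|w|, max|w|] onto [-127, 127]
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from int8 codes."""
    return q.astype(np.float32) * scale

w = np.random.randn(64, 64).astype(np.float32)
q, s = quantize_int8(w)
w_hat = dequantize(q, s)
# Rounding error is bounded by half a quantization step (scale / 2).
print(q.dtype, float(np.max(np.abs(w - w_hat))) <= s / 2 + 1e-6)
```

Storing and multiplying int8 instead of float32 is what lets accelerators trade a bounded accuracy loss for roughly 4x smaller weights and cheaper arithmetic.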

Skills acquired

  • Retrieve, analyze and synthesize data and information, using the necessary technologies
  • Teamwork
  • Decision making
  • Work in an interdisciplinary environment
  • Produce new research ideas
  • Promote free, creative and inductive thinking
Course content
  • Fundamentals of Machine Learning and Deep Learning
    (Deep Neural Networks (DNN), Convolutional Neural Networks (CNN), Recurrent Neural Networks (RNN))
  • Artificial intelligence and machine learning in hardware
    (Accelerating training and inference in hardware, dataflows)
  • AI accelerators
    (GPUs, Spatial Accelerators, Systolic Arrays, FPGAs, ASICs, SoC)
  • SW/HW design
    (Embedded processing solutions, software frameworks for neural networks)
  • Energy-efficient training and inference in AI accelerators
    (Deep learning at the edge, Static/Dynamic inference at the edge)
  • Benchmarking
    (MLPerf, DAWNBench)
  • Big Data analysis
    (Data valuation/quality, data security/privacy, regulations around data sharing)
  • Big data integration and processing
    (SW/HW requirements, design and implementation of an end-to-end big data application on the Hadoop and Spark platforms)
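The dataflow behind the spatial accelerators and systolic arrays listed above can be previewed with a short simulation. The sketch below models a systolic array computing C = A @ B as a sequence of wavefronts: at step k every processing element (i, j) consumes A[i, k] arriving from the left and B[k, j] from above and updates its local accumulator. It is a NumPy illustration of the output-stationary dataflow only; cycle-accurate timing and operand skewing are deliberately omitted.

```python
import numpy as np

def systolic_matmul(A, B):
    """Simulate an n x p grid of PEs, each holding one accumulator C[i, j].

    Step k broadcasts one wavefront of operands: PE (i, j) multiplies
    A[i, k] by B[k, j] and adds the product to its accumulator, so the
    whole step is the rank-1 update outer(A[:, k], B[k, :]).
    """
    n, m = A.shape
    m2, p = B.shape
    assert m == m2, "inner dimensions must match"
    acc = np.zeros((n, p))
    for k in range(m):                      # one wavefront per cycle
        acc += np.outer(A[:, k], B[k, :])
    return acc

A = np.arange(6).reshape(2, 3).astype(float)
B = np.arange(12).reshape(3, 4).astype(float)
assert np.allclose(systolic_matmul(A, B), A @ B)
```

Because each accumulator stays put while operands stream through, a real array of this kind reads every weight and activation from memory once per tile, which is the main source of its energy advantage over a GPU-style cache hierarchy.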
Recommended bibliography
  1. H. Park, S. Kim, Hardware accelerator systems for artificial intelligence and machine learning, in: Advances in Computers, vol. 122, Elsevier, 2021, pp. 51–95.
  2. V. Gokhale, et al., Snowflake: an efficient hardware accelerator for convolutional neural networks, in: Proc. International Symposium on Circuits and Systems (ISCAS 2017), Baltimore, MD, USA, 2017.
  3. S. Mittal, A survey of FPGA-based accelerators for convolutional neural networks, Neural Comput. Applic. 32 (2018) 1–31.
  4. V. Narasiman, et al., Improving GPU performance via large warps and two-level warp scheduling, in: Proc. IEEE/ACM International Symposium on Microarchitecture (MICRO 2011), Porto Alegre, Brazil, 2011, pp. 308–317.
  5. H. Li, et al., A high performance FPGA-based accelerator for large-scale convolutional neural networks, in: Proc. International Conference on Field Programmable Logic and Applications (FPL 2016), Lausanne, Switzerland, 2016.
  6. S. Han, et al., EIE: efficient inference engine on compressed deep neural network, in: Proc. International Symposium on Computer Architecture (ISCA 2016), Seoul, Korea, 2016.
  7. D. Kim, J. Ahn, S. Yoo, A novel zero weight/activation-aware hardware architecture of convolutional neural network, in: Proc. Design, Automation & Test in Europe Conference & Exhibition (DATE 2017), Lausanne, Switzerland, 2017.