In many cases there is still a large gap between the performance of current optimization technology and the requirements of real-world applications. As in the past, performance will improve through a combination of more powerful solution methods and a general increase in computer performance. These factors are not independent. Due to physical limits, hardware development no longer results in higher speed for sequential algorithms, but rather in increased parallelism. Modern commodity PCs include a multi-core CPU and at least one GPU, providing a low-cost, easily accessible heterogeneous environment for high-performance computing. New solution methods that combine task parallelization and stream processing are needed to fully exploit modern computer architectures and profit from future hardware developments. This paper is the first in a series of two. Its goal is to give a tutorial-style introduction to modern PC architectures and GPU programming. We start with a short historical account of modern mainstream computer architectures and a brief description of parallel computing. We then trace the evolution of modern GPUs and give a GPU programming example. Strategies and guidelines for program development are also discussed. Part II gives a broad survey of the existing literature on parallel computing targeted at modern PCs in discrete optimization, with special focus on papers on routing problems. We conclude with lessons learnt, directions for future research, and prospects.