Lecture Notes in Electrical Engineering Volume 90
For further volumes: http://www.springer.com/series/7818

Sio Iong Ao • Len Gelman

Editors

Electrical Engineering and Applied Computing


Editors

Sio Iong Ao, International Association of Engineers, Unit 1, 1/F, 37-39 Hung To Road, Kwun Tong, Hong Kong

Len Gelman, Applied Mathematics and Computing, School of Engineering, Cranfield University, Cranfield, UK

ISSN 1876-1100

e-ISSN 1876-1119

ISBN 978-94-007-1191-4

e-ISBN 978-94-007-1192-1

DOI 10.1007/978-94-007-1192-1

Springer Dordrecht Heidelberg London New York

© Springer Science+Business Media B.V. 2011

No part of this work may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, microfilming, recording or otherwise, without written permission from the Publisher, with the exception of any material supplied specifically for the purpose of being entered and executed on a computer system, for exclusive use by the purchaser of the work.

Cover design: eStudio Calamar, Berlin/Figueres

Printed on acid-free paper

Springer is part of Springer Science+Business Media (www.springer.com)

Preface

A large international conference in Electrical Engineering and Applied Computing was held in London, U.K., from 30 June to 2 July 2010, under the World Congress on Engineering (WCE 2010). WCE 2010 was organized by the International Association of Engineers (IAENG); the Congress details are available at: http://www.iaeng.org/WCE2010. IAENG is a non-profit international association for engineers and computer scientists, originally founded in 1968. The World Congress on Engineering serves as a good platform for the engineering community to meet and exchange ideas. The conferences have also struck a balance between theoretical and application development. The conference committees comprise over two hundred members, mainly research center heads, faculty deans, department heads, professors, and research scientists from over 30 countries. The conferences are truly international meetings with a high level of participation from many countries. The response to the Congress has been excellent: more than one thousand manuscripts were submitted to WCE 2010. All submitted papers went through the peer review process, and the overall acceptance rate was 57%.

This volume contains fifty-five revised and extended research articles written by prominent researchers participating in the conference. Topics covered include Control Engineering, Network Management, Wireless Networks, Biotechnology, Signal Processing, Computational Intelligence, Computational Statistics, Internet Computing, High Performance Computing, and industrial applications. The book presents the state of the art of the tremendous advances in electrical engineering and applied computing and also serves as an excellent reference work for researchers and graduate students working in these fields.

Sio Iong Ao
Len Gelman


Contents

1  Mathematical Modelling for Coal Fired Supercritical Power Plants and Model Parameter Identification Using Genetic Algorithms
   Omar Mohamed, Jihong Wang, Shen Guo, Jianlin Wei, Bushra Al-Duri, Junfu Lv and Qirui Gao

2  Sequential State Computation Using Discrete Modeling
   Dumitru Topan and Lucian Mandache

3  Detection and Location of Acoustic and Electric Signals from Partial Discharges with an Adaptative Wavelet-Filter Denoising
   Jesus Rubio-Serrano, Julio E. Posada and Jose A. Garcia-Souto

4  Study on a Wind Turbine in Hybrid Connection with a Energy Storage System
   Hao Sun, Jihong Wang, Shen Guo and Xing Luo

5  SAR Values in a Homogenous Human Head Model
   Levent Seyfi and Ercan Yaldız

6  Mitigation of Magnetic Field Under Overhead Transmission Line
   Adel Zein El Dein Mohammed Moussa

7  Universal Approach of the Modified Nodal Analysis for Nonlinear Lumped Circuits in Transient Behavior
   Lucian Mandache, Dumitru Topan and Ioana-Gabriela Sirbu

8  Modified 1.28 Tbit/s (32 × 4 × 10 Gbit/s) Absolute Polar Duty Cycle Division Multiplexing-WDM Transmission Over 320 km Standard Single Mode Fiber
   Amin Malekmohammadi

9  Wi-Fi WEP Point-to-Point Links
   J. A. R. Pacheco de Carvalho, H. Veiga, N. Marques, C. F. Ribeiro Pacheco and A. D. Reis

10  Interaction Between the Mobile Phone and Human Head of Various Sizes
    Adel Zein El Dein Mohammed Moussa and Aladdein Amro

11  A Medium Range Gbps FSO Link
    J. A. R. Pacheco de Carvalho, N. Marques, H. Veiga, C. F. Ribeiro Pacheco and A. D. Reis

12  A Multi-Classifier Approach for WiFi-Based Positioning System
    Jikang Shin, Suk Hoon Jung, Giwan Yoon and Dongsoo Han

13  Intensity Constrained Flat Kernel Image Filtering, a Scheme for Dual Domain Local Processing
    Alexander A. Gutenev

14  Convolutive Blind Separation of Speech Mixtures Using Auditory-Based Subband Model
    Sid-Ahmed Selouani, Yasmina Benabderrahmane, Abderraouf Ben Salem, Habib Hamam and Douglas O'Shaughnessy

15  Time Domain Features of Heart Sounds for Determining Mechanical Valve Thrombosis
    Sabri Altunkaya, Sadık Kara, Niyazi Görmüş and Saadetdin Herdem

16  On the Implementation of Dependable Real-Time Systems with Non-Preemptive EDF
    Michael Short

17  Towards Linking Islands of Information Within Construction Projects Utilizing RF Technologies
    Javad Majrouhi Sardroud and Mukesh Limbachiy

18  A Case Study Analysis of an E-Business Security Negotiations Support Tool
    Jason R. C. Nurse and Jane E. Sinclair

19  Smart Card Web Server
    Lazaros Kyrillidis, Keith Mayes and Konstantinos Markantonakis

20  A Scalable Hardware Environment for Embedded Systems Education
    Tiago Gonçalves, A. Espírito-Santo, B. J. F. Ribeiro and P. D. Gaspar

21  Yield Enhancement with a Novel Method in Design of Application-Specific Networks on Chips
    Atena Roshan Fekr, Majid Janidarmian, Vahhab Samadi Bokharaei and Ahmad Khademzadeh

22  On-Line Image Search Application Using Fast and Robust Color Indexing and Multi-Thread Processing
    Wichian Premchaisawadi and Anucha Tungkatsathan

23  Topological Mapping Using Vision and a Sparse Distributed Memory
    Mateus Mendes, A. Paulo Coimbra and Manuel M. Crisóstomo

24  A Novel Approach for Combining Genetic and Simulated Annealing Algorithms
    Younis R. Elhaddad and Omar Sallabi

25  Buyer Coalition Formation with Bundle of Items by Ant Colony Optimization
    Anon Sukstrienwong

26  Coevolutionary Grammatical Evolution for Building Trading Algorithms
    Kamal Adamu and Steve Phelps

27  High Performance Computing Applied to the False Nearest Neighbors Method: Box-Assisted and kd-Tree Approaches
    Julio J. Águila, Ismael Marín, Enrique Arias, María del Mar Artigao and Juan J. Miralles

28  Ethernet Based Implementation of a Periodic Real Time Distributed System
    Sahraoui Zakaria, Labed Abdennour and Serir Aomar

29  Preliminary Analysis of Flexible Pavement Performance Data Using Linear Mixed Effects Models
    Hsiang-Wei Ker and Ying-Haur Lee

30  Chi-Squared, Yule's Q and Likelihood Ratios in Tabular Audiology Data
    Muhammad Naveed Anwar, Michael P. Oakes and Ken McGarry

31  Optimising Order Splitting and Execution with Fuzzy Logic Momentum Analysis
    Abdalla Kablan and Wing Lon Ng

32  The Determination of a Dynamic Cut-Off Grade for the Mining Industry
    P. V. Johnson, G. W. Evatt, P. W. Duck and S. D. Howell

33  Improved Prediction of Financial Market Cycles with Artificial Neural Network and Markov Regime Switching
    David Liu and Lei Zhang

34  Fund of Hedge Funds Portfolio Optimisation Using a Global Optimisation Algorithm
    Bernard Minsky, M. Obradovic, Q. Tang and Rishi Thapar

35  Increasing the Sensitivity of Variability EWMA Control Charts
    Saddam Akber Abbasi and Arden Miller

36  Assessing Response's Bias, Quality of Predictions, and Robustness in Multiresponse Problems
    Nuno Costa, Zulema Lopes Pereira and Martín Tanco

37  Inspection Policies in Service of Fatigued Aircraft Structures
    Nicholas A. Nechval, Konstantin N. Nechval and Maris Purgailis

38  Toxicokinetic Analysis of Asymptomatic Hazard Profile of Welding Fumes and Gases
    Joseph I. Achebo and Oviemuno Oghoore

39  Classification and Measurement of Efficiency and Congestion of Supply Chains
    Mithun J. Sharma and Song Jin Yu

40  Comparison of Dry and Flood Turning in Terms of Dimensional Accuracy and Surface Finish of Turned Parts
    Noor Hakim Rafai and Mohammad Nazrul Islam

41  Coordinated Control Methods of Waste Water Treatment Process
    Magdi S. Mahmoud

42  Identical Parallel-Machine Scheduling and Worker Assignment Problem Using Genetic Algorithms to Minimize Makespan
    Imran Ali Chaudhry and Sultan Mahmood

43  Dimensional Accuracy Achievable in Wire-Cut Electrical Discharge Machining
    Mohammad Nazrul Islam, Noor Hakim Rafai and Sarmilan Santhosam Subramanian

44  Nash Game-Theoretic Model for Optimizing Pricing and Inventory Policies in a Three-Level Supply Chain
    Yun Huang and George Q. Huang

45  Operating Schedule: Take into Account Unexpected Events in Case of a Disaster
    Issam Nouaouri, Jean Christophe Nicolas and Daniel Jolly

46  Dynamic Hoist Scheduling Problem on Real-Life Electroplating Production Line
    Krzysztof Kujawski and Jerzy Świątek

47  Effect of HAART on CTL Mediated Immune Cells: An Optimal Control Theoretic Approach
    Priti Kumar Roy and Amar Nath Chatterjee

48  Design, Development and Validation of a Novel Mechanical Occlusion Device for Transcervical Sterilization
    Muhammad Rehan, James Eugene Coleman and Abdul Ghani Olabi

49  Investigation of Cell Adhesion, Contraction and Physical Restructuring on Shear Sensitive Liquid Crystals
    Chin Fhong Soon, Mansour Youseffi, Nick Blagden and Morgan Denyer

50  On the Current Densities for the Electrical Impedance Equation
    Marco Pedro Ramirez Tachiquin, Jose de Jesus Gutierrez Cortes, Victor Daniel Sanchez Nava and Edgar Bernal Flores

51  Modelling of Diseased Tissue Diffuse Reflectance and Extraction of Optical Properties
    Shanthi Prince and S. Malarvizhi

52  Vertical Incidence Increases Virulence in Pathogens: A Model Based Study
    Priti Kumar Roy, Jayanta Mondal and Samrat Chatterjee

53  Chaotic Oscillations in Hodgkin–Huxley Neural Dynamics
    Mayur Sarangdhar and Chandrasekhar Kambhampati

54  Quantification of Similarity Using Amplitudes and Firing Times of a Hodgkin–Huxley Neural Response
    Mayur Sarangdhar and Chandrasekhar Kambhampati

55  Reduction of HIV Infection that Includes a Delay with Cure Rate During Long Term Treatment: A Mathematical Study
    Priti Kumar Roy and Amar Nath Chatterjee

Chapter 1

Mathematical Modelling for Coal Fired Supercritical Power Plants and Model Parameter Identification Using Genetic Algorithms

Omar Mohamed, Jihong Wang, Shen Guo, Jianlin Wei, Bushra Al-Duri, Junfu Lv and Qirui Gao

Abstract The paper presents the progress of our study of a whole-process mathematical model for a supercritical coal-fired power plant. The modelling procedure is rooted in thermodynamic and engineering principles, with reference to previously published literature. The unknown model parameters are identified using Genetic Algorithms (GAs) with on-site measurement data from a 600 MW supercritical power plant. The identified parameters are verified with different sets of measured plant data. Although some assumptions are made in the modelling process to simplify the model structure to a certain level, the supercritical coal-fired power plant model reported in the paper can represent the main features of the real plant's once-through unit operation, and the simulation results show that the main variation trends of the process agree well with the measured dynamic responses from the power plant.

O. Mohamed (corresponding author), J. Wang, S. Guo, J. Wei
School of Electrical, Electronics, and Computer Engineering, University of Birmingham, Edgbaston, Birmingham, B15 2TT, UK

B. Al-Duri
School of Chemical Engineering, University of Birmingham, Edgbaston, Birmingham, B15 2TT, UK

J. Lv, Q. Gao
Department of Thermal Engineering, Tsinghua University, Beijing, People's Republic of China

S. I. Ao and L. Gelman (eds.), Electrical Engineering and Applied Computing, Lecture Notes in Electrical Engineering, 90, DOI: 10.1007/978-94-007-1192-1_1, © Springer Science+Business Media B.V. 2011

Nomenclature

ff          Fitness function for genetic algorithms
ffr         Pulverized fuel flow rate (kg/s)
$h$         Enthalpy per unit mass (MJ/kg)
$K$         Constant parameter
$k$         Mass flow rate gain
$m$         Mass (kg)
$\dot{m}$   Mass flow rate (kg/s)
$P$         Pressure of a heat exchanger (MPa)
$\dot{Q}$   Heat transfer rate (MJ/s)
$R$         Response
$T$         Temperature (°C)
$t$         Time (s)
$\tau$      Time constant (s)
$U$         Internal energy (MJ)
$V$         Volume of fluid (m³)
$\dot{W}$   Work rate or power (MW)
$x$         Generator reactance (p.u.)
$y$         Output vector
$\rho$      Density (kg/m³)
$\chi$      Valve opening
$\delta$    Rotor angle (rad)
$\theta$    Mechanical angle (rad)
$\omega$    Speed (p.u.)
$\Gamma$    Torque (p.u.)

Subscripts

a      Accelerating
air    Air
e      Electrical
d      Direct axis
ec     Economizer
hp     High pressure turbine
hx     Heat exchanger
i      Inlet
ip     Intermediate pressure turbine
me     Mechanical
ms     Main steam
m      Measured
o      Outlet
out    Output of the turbine
q      Quadrature axis
rh     Reheater
sh     Superheater
si     Simulated
ww     Waterwall

Abbreviations

BMCR   Boiler maximum continuous rate
ECON   Economizer
GA     Genetic algorithm
HP     High pressure
HX     Heat exchanger
IP     Intermediate pressure
MS     Main steam
RH     Reheater
SC     Supercritical
SH     Superheater
WW     Waterwall

1.1 Introduction

The world is now facing the challenges of global warming and environmental protection. At the same time, the demand for electricity is growing rapidly due to economic growth and increases in population, especially in developing countries such as China and India. From the standpoint of the environment and sustainable energy development, renewable sources such as wind, solar, and tidal energy should in theory be the only resources explored. However, the growth in demand also weighs heavily in the energy equation, so renewable energy alone is unlikely to generate sufficient electricity to fill the gap in the near future. Power generation using fossil fuels is therefore inevitable; in particular, coal-fired power generation is an unavoidable choice due to its huge capacity and flexibility in load following.

As is well known, conventional coal-fired power plants have a huge environmental impact and lower energy conversion efficiencies. Any new coal-fired power plants must be cleaner, with more advanced and improved technologies. Apart from Carbon Capture and Storage, supercritical power plants might be the most suitable choice when environmental enhancement, higher energy efficiency, and economic growth are considered. However, there is an issue to be addressed regarding their dynamic responses and performance relative to conventional subcritical plants, due to the difference in the process structure and energy storage drum [1]. The characteristics of supercritical plants require considerable attention and investigation. Supercritical boilers have to be once-through boilers because there is no distinction between the water and steam phases in the supercritical process, so there is no drum to separate a water-steam mixture. Due to the absence of the drum, once-through boilers have less stored energy and faster responses than drum-type boilers. Supercritical power plants have several advantages [2, 3] over traditional subcritical plants, including:

• Higher cycle efficiency (up to 46%) and lower fuel consumption.
• Reduced CO2 emissions per unit of power generation.
• Full integrability with CO2 capture technology.
• Fast load demand following (for relatively small load demand changes).

However, some concerns have also been raised about whether their dynamic responses can meet the demand for dynamic response speed. These concerns stem mainly from the once-through structure: there is no drum to store energy as a buffer for responding to rapid changes in load demand. This paper develops a mathematical model of the whole plant process in order to study its dynamic responses and to answer the questions about dynamic response speed.

The literature survey shows that several models have been reported, with emphasis on different aspects of the boiler characteristics. Studies of the dynamic response and control systems of once-through supercritical (SC) units can be traced back to 1958, when work was done on a time-based simulation for the Eddystone I unit of the Philadelphia Electric Company; the work was later extended, in 1966, to the simulation of the Bull Run SC generation unit [4]. Yutaka Suzuki et al. modelled a once-through SC boiler in order to improve the control system of an existing supercritical oil-fired plant. The model was based on nonlinear partial differential equations and was validated through simulation studies [5]. Wataro Shinohara et al. presented a simplified state-space model for an SC once-through boiler-turbine system and designed a nonlinear controller [6]. A pressure node model description was introduced by Toshio Inoue et al. for power system frequency simulation studies [7]. Intelligent techniques have also yielded excellent modelling performance. Neural networks have been introduced to model SC power plants, with sufficiently accurate results if they are trained with suitable data provided by an operating unit [8]. However, neural network performance is unsatisfactory for simulating some emergency conditions of the plant, because the NN method depends entirely on the data used for the learning process, not on physical laws.
Simulation of SC boilers may be based either theoretically on physical laws or empirically on experimental work. In this paper, the proposed mathematical model is based on thermodynamic principles, and the model parameters are identified using data obtained from a 600 MW SC power plant [9]. The simulation results show that the model can be trusted to simulate the whole once-through mode of operation to a certain level of accuracy.


1.2 Mathematical Model of the Plant

1.2.1 Plant Description

A once-through supercritical 600 MW power plant unit is selected for the modelling study. The schematic view of the boiler is shown in Fig. 1.1. Water from the feedwater heater is heated in the economizer before entering the superheating stages through the waterwall. The superheater consists of three sections: a low temperature superheater, a platen superheater, and a final stage superheater. At steady state, the main outlet steam temperature is about 571 °C and the pressure is 25.5 MPa. There are two reheating sections in the boiler for reheating the steam exhausted from the high pressure turbine. The inlet temperature of the reheater is 309 °C, the outlet temperature is nearly 571 °C, and the average pressure is 4.16 MPa. The reheated steam is used to energize the intermediate pressure turbine. The mechanical power is generated through multi-stage turbines, which provide adequate expansion of the steam through the turbine and, consequently, a high thermal efficiency of the plant.

1.2.2 Assumptions Made for Modelling

Assumptions are made to simplify the process. They should be acceptable to plant engineers and sufficient to reduce the complex physical model to a simple mathematical model for research purposes. Some of these assumptions are usually adopted for modelling supercritical or subcritical boilers [10]. In the work reported in this paper, the following general assumptions are made:

• Fluid properties are uniform at any cross section, and the fluid flow in the boiler tubes is one-phase flow.
• In the heat exchanger, the pipes for each heat exchanger are lumped together to form one pipe.
• Only one control volume is considered in the waterwall.
• The dynamic behaviour of the air and gas pressure is neglected.

Fig. 1.1 Schematic view of the plant

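1.2.3 The Boiler Model

1.2.3.1 Heat Exchanger Model

The various heat exchangers in the boiler are modelled using the principles of mass and energy balance. The sub-cooled water in the economizer is converted directly to supercritical steam in the waterwall without passing through an evaporation stage. The equations are expressed in terms of the derivatives (variation rates) of the pressure and temperature of the heat exchanger. The mass balance equation of the heat exchanger (control volume) is:

$$\frac{dm}{dt} = \dot{m}_i - \dot{m}_o \qquad (1.1)$$

For a constant effective volume, Eq. 1.1 becomes:

$$V\frac{d\rho}{dt} = \dot{m}_i - \dot{m}_o$$

The density is a differentiable function of two variables, which can be taken as the temperature and pressure inside the control volume; thus we have:

$$V\left(\left.\frac{\partial\rho}{\partial P}\right|_T \frac{dP}{dt} + \left.\frac{\partial\rho}{\partial T}\right|_P \frac{dT}{dt}\right) = \dot{m}_i - \dot{m}_o$$

The energy balance equation is:

$$\frac{dU_{hx}}{dt} = \dot{Q}_{hx} + \dot{m}_i h_i - \dot{m}_o h_o$$

Also,

$$\frac{dU_{hx}}{dt} = V\left[h\left(\left.\frac{\partial\rho}{\partial P}\right|_T \frac{dP}{dt} + \left.\frac{\partial\rho}{\partial T}\right|_P \frac{dT}{dt}\right) + \rho\left(\left.\frac{\partial h}{\partial P}\right|_T \frac{dP}{dt} + \left.\frac{\partial h}{\partial T}\right|_P \frac{dT}{dt}\right)\right] - V\frac{dP}{dt}$$

Equating the two expressions gives:

$$V\left[h\left(\left.\frac{\partial\rho}{\partial P}\right|_T \frac{dP}{dt} + \left.\frac{\partial\rho}{\partial T}\right|_P \frac{dT}{dt}\right) + \rho\left(\left.\frac{\partial h}{\partial P}\right|_T \frac{dP}{dt} + \left.\frac{\partial h}{\partial T}\right|_P \frac{dT}{dt}\right)\right] - V\frac{dP}{dt} = \dot{Q}_{hx} + \dot{m}_i h_i - \dot{m}_o h_o \qquad (1.2)$$

Combining (1.1) and (1.2) gives the pressure and temperature state derivatives:

$$\dot{P} = \frac{\dot{Q}_{hx} + \dot{m}_i H_i - \dot{m}_o H_o}{\tau} \qquad (1.3)$$

$$\dot{T} = C(\dot{m}_i - \dot{m}_o) - D\dot{P} \qquad (1.4)$$

where:

$$H_i = h_i - h - \rho\,\frac{\left.\frac{\partial h}{\partial T}\right|_P}{\left.\frac{\partial\rho}{\partial T}\right|_P} \qquad (1.5)$$

$$H_o = h_o - h - \rho\,\frac{\left.\frac{\partial h}{\partial T}\right|_P}{\left.\frac{\partial\rho}{\partial T}\right|_P} \qquad (1.6)$$

$$\tau = V\left(\rho\left.\frac{\partial h}{\partial P}\right|_T - \rho\,\frac{\left.\frac{\partial h}{\partial T}\right|_P \left.\frac{\partial\rho}{\partial P}\right|_T}{\left.\frac{\partial\rho}{\partial T}\right|_P} - 1\right) \qquad (1.7)$$

$$C = \frac{1}{V\left.\frac{\partial\rho}{\partial T}\right|_P} \qquad (1.8)$$

$$D = \frac{\left.\frac{\partial\rho}{\partial P}\right|_T}{\left.\frac{\partial\rho}{\partial T}\right|_P} \qquad (1.9)$$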
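As an illustration of how the state derivatives (1.3) and (1.4) might be evaluated numerically, the following minimal Python sketch treats the coefficients of Eqs. 1.5 to 1.9 as fixed, already identified numbers. The names `HeatExchangerParams`, `state_derivatives`, and `euler_step` are hypothetical, and the explicit Euler step is only an assumption made for the example; the paper does not state which integration scheme was used.

```python
from dataclasses import dataclass


@dataclass
class HeatExchangerParams:
    """Lumped coefficients of Eqs. 1.3-1.4 for one heat exchanger (hypothetical container)."""
    H_i: float   # inlet enthalpy coefficient, Eq. 1.5
    H_o: float   # outlet enthalpy coefficient, Eq. 1.6
    tau: float   # time constant, Eq. 1.7
    C: float     # mass-imbalance gain, Eq. 1.8
    D: float     # pressure-rate gain, Eq. 1.9


def state_derivatives(p, q_hx, m_in, m_out):
    """Return (dP/dt, dT/dt) according to Eqs. 1.3 and 1.4."""
    dP = (q_hx + m_in * p.H_i - m_out * p.H_o) / p.tau
    dT = p.C * (m_in - m_out) - p.D * dP
    return dP, dT


def euler_step(p, P, T, q_hx, m_in, m_out, dt):
    """Advance pressure and temperature by one explicit Euler step (assumed scheme)."""
    dP, dT = state_derivatives(p, q_hx, m_in, m_out)
    return P + dt * dP, T + dt * dT
```

Calling `euler_step` repeatedly with measured or simulated inlet and outlet flows would trace out the pressure and temperature of one lumped heat exchanger under these assumptions.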

The temperature of the superheater is controlled by the attemperator. Therefore, the input mass flow rate to the superheater is the sum of the SC steam and the water spray from the attemperator. The amount of attemperator water spray is regulated by opening the spray valve, which responds to a signal from the PI controller. This prevents large temperature fluctuations and ensures maximum efficiency over a wide range of operation.

1.2.3.2 Fluid Flow

The fluid flow in the boiler tubes for one-phase flow is:

$$\dot{m} = k\sqrt{\Delta P} \qquad (1.10)$$

Equation 1.10 is the simplest mathematical expression for fluid flow in boiler tubes. The flows out of the reheater and the main steam outlet are, respectively:

$$\dot{m}_{rh} = K'_1\,\frac{P_{rh}}{\sqrt{T_{rh}}}\,\chi_{rh} \qquad (1.11)$$

$$\dot{m}_{ms} = K'_2\,\frac{P_{ms}}{\sqrt{T_{ms}}}\,\chi_{ms} \qquad (1.12)$$

The detailed derivation of (1.11) and (1.12) can be found in [11].
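The flow relations (1.10) to (1.12) translate directly into code. The sketch below is only illustrative: the function names `tube_flow` and `valve_flow` are hypothetical, and the input checks (non-negative pressure drop, positive temperature) are assumptions added to keep the square roots well defined. The temperature unit expected by the gain is left to the caller, since it depends on how the gain was identified.

```python
import math


def tube_flow(k, delta_p):
    """Eq. 1.10: mass flow rate through boiler tubes, k * sqrt(delta_p)."""
    if delta_p < 0:
        raise ValueError("pressure drop must be non-negative")
    return k * math.sqrt(delta_p)


def valve_flow(gain, pressure, temperature, opening):
    """Eqs. 1.11-1.12: gain * pressure / sqrt(temperature) * valve opening."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    return gain * pressure / math.sqrt(temperature) * opening
```

The same `valve_flow` function serves both the reheater outlet (1.11) and the main steam outlet (1.12); only the gain, pressure, temperature, and valve opening passed in differ.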


1.2.4 Turbine/Generator Model

1.2.4.1 Turbine Model

The turbine is modelled through energy balance equations and is then combined with the boiler model. The work rates of the high pressure and intermediate pressure turbines are:

$$\dot{W}_{hp} = \dot{m}_{ms}(h_{ms} - h_{out}) \qquad (1.13)$$

$$\dot{W}_{ip} = \dot{m}_{rh}(h_{rh} - h_{out}) \qquad (1.14)$$

The mechanical power of the plant is:

$$P_{me} = \dot{W}_{hp} + \dot{W}_{ip} \qquad (1.15)$$

Up to Eq. 1.14, the boiler-turbine unit is modelled as a set of combined equations and can be used for simulation if we assume that the generator responds instantaneously. However, the dynamics of the turbine speeds and torques are affected by the generator dynamics, and injecting only the mechanical power into the generator model does not capture this interaction between the variables. To obtain a strong coupling between the variables of the turbine-generator models, torque equilibrium equations are added to the turbine model:

$$\dot{\omega}_{hp} = \frac{1}{M_{hp}}\left[\Gamma_{hp} - D_{hp}\,\omega_{hp} - K_{HI}(\theta_{hp} - \theta_{ip})\right] \qquad (1.16)$$

$$\dot{\theta}_{hp} = \omega_b(\omega_{hp} - 1) = (\omega_{hp} - 1) \qquad (1.17)$$

$$\dot{\omega}_{ip} = \frac{1}{M_{ip}}\left[\Gamma_{ip} - D_{ip}\,\omega_{ip} + K_{HI}(\theta_{hp} - \theta_{ip}) - K_{IG}(\theta_{hp} - \theta_g)\right] \qquad (1.18)$$

$$\dot{\theta}_{ip} = \omega_b(\omega_{ip} - 1) = (\omega_{ip} - 1) \qquad (1.19)$$

Note that, for a two-pole machine, $\theta_g = \delta$.
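To make the turbine relations concrete, the following sketch evaluates Eqs. 1.13 to 1.15 and the high pressure shaft equations (1.16) and (1.17). All function and argument names are hypothetical. The default `omega_b=1.0` is an assumption that mirrors the second equality in Eq. 1.17; with enthalpy in MJ/kg and flow in kg/s, the stage power comes out in MW.

```python
def stage_power(m_dot, h_in, h_out):
    """Eqs. 1.13-1.14: work rate of one turbine stage, m_dot * (h_in - h_out)."""
    return m_dot * (h_in - h_out)


def mechanical_power(w_hp, w_ip):
    """Eq. 1.15: total mechanical power as the sum of the HP and IP work rates."""
    return w_hp + w_ip


def hp_shaft_derivatives(omega_hp, theta_hp, theta_ip, torque_hp,
                         M_hp, D_hp, K_HI, omega_b=1.0):
    """Eqs. 1.16-1.17: speed and angle derivatives of the HP turbine shaft."""
    d_omega = (torque_hp - D_hp * omega_hp - K_HI * (theta_hp - theta_ip)) / M_hp
    d_theta = omega_b * (omega_hp - 1.0)
    return d_omega, d_theta
```

The intermediate pressure shaft (Eqs. 1.18 and 1.19) could be written in the same way, with the additional coupling term to the generator angle.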

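1.2.4.2 Generator Model

Generator models are reported in a number of publications; a third-order nonlinear model is adopted in our work [12]:

$$\dot{\delta} = \Delta\omega \qquad (1.20)$$

$$J\,\Delta\dot{\omega} = \Gamma_a = \Gamma_m - \Gamma_e - D\,\Delta\omega \qquad (1.21)$$

$$\dot{e}'_q = \frac{1}{T'_{do}}\left[E_{FD} - e'_q - (x_d - x'_d)\,i_d\right] \qquad (1.22)$$

$$\Gamma_e\,(\text{p.u.}) \approx P_e\,(\text{p.u.}) = \frac{V}{x'_d}\,e'_q \sin\delta + \frac{V^2}{2}\left(\frac{1}{x_q} - \frac{1}{x'_d}\right)\sin 2\delta \qquad (1.23)$$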
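The third-order generator model (1.20) to (1.23) can be sketched as a function returning the three state derivatives. The names below are hypothetical, and two simplifications are assumptions made for the example: the direct-axis current `i_d` is passed in as an input rather than computed from the network equations, and the electrical torque is taken equal to the per-unit electrical power of Eq. 1.23.

```python
import math


def electrical_torque(delta, e_q, V, x_d_t, x_q):
    """Eq. 1.23: per-unit electrical torque (taken equal to electrical power)."""
    return (V / x_d_t * e_q * math.sin(delta)
            + 0.5 * V ** 2 * (1.0 / x_q - 1.0 / x_d_t) * math.sin(2.0 * delta))


def generator_derivatives(delta, d_omega, e_q, *, torque_m, E_fd, i_d,
                          V, x_d, x_d_t, x_q, J, D, T_do):
    """Return (d delta/dt, d(d_omega)/dt, d e_q/dt) from Eqs. 1.20-1.22.

    x_d_t is the transient reactance x'_d; T_do is the time constant T'_do.
    """
    torque_e = electrical_torque(delta, e_q, V, x_d_t, x_q)
    d_delta = d_omega                                      # Eq. 1.20
    dd_omega = (torque_m - torque_e - D * d_omega) / J     # Eq. 1.21
    d_e_q = (E_fd - e_q - (x_d - x_d_t) * i_d) / T_do      # Eq. 1.22
    return d_delta, dd_omega, d_e_q
```

In a coupled simulation, `torque_m` would come from the turbine model and the returned derivatives would be integrated together with the boiler and turbine states.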

1.3 Model Parameter Identification

1.3.1 Identification Procedures

The model parameters defined by formulae (1.3) to (1.7), together with the mass flow rate gains, heat transfer constants, and turbine and generator parameters, are all identified by Genetic Algorithms in a sequential manner. Even though some of these parameters are inherently not constant, they are fitted directly to the actual plant response to save time and effort. Various data sets of boiler responses have been chosen for identification and verification. First, the parameters of the pressure derivative equations are identified. Then, the identification is extended to include the temperature equations, the turbine model parameters, and finally the generator model parameters. The measured responses chosen for identification and verification are:

• Reheater pressure.
• Main SC steam pressure.
• Main SC steam temperature.
• Mass flow rate of SC steam from the boiler main outlet to the HP turbine.
• Mass flow rate of reheated steam from the reheater outlet to the IP turbine.
• Turbine speed.
• Infinite bus frequency.
• Generated power of the plant.

In recent years, Genetic Algorithms have been widely used as an optimization tool for nonlinear system identification and optimization because of their many advantages over conventional mathematical optimization techniques. GAs have been shown to be a robust optimization method for parameter identification of subcritical boiler models [13]. Initially, the GA produces random values for all the parameters to be identified; this set is called the initial population. It then calculates the corresponding fitness function in order to copy the best-coded parameters into the next generation. The GA termination criteria depend on the value of the fitness function. If the termination criterion is not met, the GA continues to perform its three main operations: reproduction, crossover, and mutation. The fitness function for the proposed task is:

$$ff = \sum_{n=1}^{N} (R_m - R_{si})^2 \qquad (1.24)$$

The fitness function is the sum of the squared differences between the measured and simulated responses for each of the variables mentioned in this section, where N is the number of points of recorded measured data. Load-up and load-down data have been used for identification. The load changes range from 30% up to 100% and back down to 55%, and are used to verify the derived model. The model is verified with ramp load-up data and steady-state data to cover a large range of once-through operation. The model has also been verified with a third set of data. The GA parameter settings for identification are listed below:

• Generation: 100
• Population type: double vector
• Creation function: uniform
• Population size: 50–100
• Mutation rate: 0.1
• Mutation function: Gaussian
• Migration direction: forward
• Selection: stochastic uniform

Figure 1.2 shows some of the load-up identification results. The measured and simulated responses match very well for the generated power and reasonably well for the temperature. Some parameters of the boiler model are listed in Table 1.1, and the heat transfer rate parameters are listed in Table 1.2.
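The identification loop described above can be sketched in a few dozen lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: a toy first-order step response (`simulate`) stands in for the plant model, tournament selection with elitism replaces the stochastic uniform selection listed above, and the blend crossover and the mutation width are arbitrary choices. The sum-of-squares fitness, the uniform creation of a real-valued population, the Gaussian mutation, and the mutation rate of 0.1 follow the text.

```python
import math
import random


def simulate(params, times):
    """Toy first-order step response, a stand-in for the plant model."""
    gain, tau = params
    return [gain * (1.0 - math.exp(-t / tau)) for t in times]


def fitness(params, times, measured):
    """Eq. 1.24: sum of squared errors between measured and simulated responses."""
    simulated = simulate(params, times)
    return sum((m - s) ** 2 for m, s in zip(measured, simulated))


def run_ga(times, measured, bounds, pop_size=50, generations=100,
           mutation_rate=0.1, seed=0):
    """Return the best parameter vector and the best fitness per generation."""
    rng = random.Random(seed)

    def score(ind):
        return fitness(ind, times, measured)

    # Uniform creation of the initial real-valued population inside the bounds.
    population = [[rng.uniform(lo, hi) for lo, hi in bounds]
                  for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        population.sort(key=score)
        history.append(score(population[0]))
        next_gen = [population[0][:]]  # elitism: carry the best individual over
        while len(next_gen) < pop_size:
            # Tournament selection (assumed here) of two parents.
            p1 = min(rng.sample(population, 3), key=score)
            p2 = min(rng.sample(population, 3), key=score)
            # Blend crossover: random convex combination of the parents.
            w = rng.random()
            child = [w * a + (1.0 - w) * b for a, b in zip(p1, p2)]
            # Gaussian mutation gene by gene, then clip to the bounds.
            for i, (lo, hi) in enumerate(bounds):
                if rng.random() < mutation_rate:
                    child[i] += rng.gauss(0.0, 0.1 * (hi - lo))
                child[i] = min(max(child[i], lo), hi)
            next_gen.append(child)
        population = next_gen
    best = min(population, key=score)
    history.append(score(best))
    return best, history
```

For example, generating `measured = simulate((2.0, 5.0), times)` and calling `run_ga(times, measured, [(0.1, 10.0), (0.1, 20.0)])` searches for the gain and time constant that reproduce the synthetic data. Because of elitism, the recorded best fitness never increases from one generation to the next.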

1.3.2 Model Parameter Verification

The proposed model has been validated using a number of data sets, namely the load-down and steady-state data. Figure 1.3 shows some of the simulated verification results (load-down and steady-state simulation). The results presented show that the model response and the actual plant response agree well with each other.
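The agreement between model and plant is judged here from the plotted responses. If a numerical summary were wanted, one option (an assumption for illustration, not a step reported in the paper) would be to normalise the squared-error measure of Eq. 1.24 into a root-mean-square error for each verification data set. The names `rmse` and `verify` are hypothetical.

```python
import math


def rmse(measured, simulated):
    """Root-mean-square error between one measured and one simulated response."""
    if len(measured) != len(simulated) or not measured:
        raise ValueError("responses must be non-empty and equally long")
    sse = sum((m - s) ** 2 for m, s in zip(measured, simulated))
    return math.sqrt(sse / len(measured))


def verify(datasets):
    """Map each data-set name to the RMSE of its (measured, simulated) pair."""
    return {name: rmse(m, s) for name, (m, s) in datasets.items()}
```

A lower value on data that were not used for identification would indicate that the identified parameters generalise beyond the identification set.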

Fig. 1.2 Identification results. Panels: main steam pressure (MPa), power (MW), main steam temperature (°C), and steam flow (kg/s), each plotted against time (min); curves: model response and plant response.

Table 1.1 Heat exchanger parameters

HX      H_i     H_o     C          D
ECON    10.2    13.6    2.1e-6     -3.93
WW      12.2    13.3    -1.2e-6    -0.1299
SH      20.5    45.9    1e-6       -3.73
RH      19.8    22.0    -1e-6      -17.9

Table 1.2 Heat transfer rate

τ_1 (s)    K_ec      K_ww    K_sh      K_rh
9.3        5.7785    7.78    23.776    21.43

Fig. 1.3 Verification results. Panels: main steam temperature (°C), power (MW), reheater pressure (MPa), and steam flow (kg/s), each plotted against time (min); curves: plant response and model response.

1.4 Concluding Remarks

A mathematical model for coal-fired power generation with a supercritical boiler has been presented in the paper. The model is based on thermodynamic laws and engineering principles. The model parameters are identified using recorded on-site operating data. The model is then verified using different data sets, and the simulation results show good agreement between the measured and simulated data. For future work, the model will be combined with a nonlinear mathematical model of the coal mill to obtain a complete process mathematical model from coal preparation to electricity generation. It is expected that the mill local control system will contribute greatly to enhancing the overall control of the plant.

Acknowledgments The authors would like to thank E.ON Engineering for their support and engineering advice. The authors also thank EPSRC (RG/G062889/1) and the ERD/AWM Birmingham Science City Energy Efficiency and Demand Reduction project for the research funding support.

References

1. Kundur P (1981) A survey of utility experiences with power plant response during partial load rejection and system disturbances. IEEE Trans Power Apparatus Syst PAS-100(5):2471–2475
2. Laubli F, Fenton FH (1971) The flexibility of the supercritical boiler as a partner in power system design and operation: part I. IEEE Trans Power Apparatus Syst PAS-90(4):1719–1724
3. Laubli F, Fenton FH (1971) The flexibility of the supercritical boiler as a partner in power system design and operation: part II. IEEE Trans Power Apparatus Syst PAS-90(4):1725–1733
4. Littman B, Chen TS (1966) Simulation of bull-run supercritical generation unit. IEEE Trans Power Apparatus Syst 85:711–722
5. Suzuki Y, Sik P, Uchida Y (1979) Simulation of once-through supercritical boiler. Simulation 33:181–193
6. Shinohara W, Kotischek DE (1995) A simplified model based supercritical power plant controller. In: Proceeding of the 35th IEEE Conference on Decision and Control, vol 4, pp 4486–4491
7. Inoue T, Taniguchi H, Ikeguchi Y (2000) A model of fossil fueled plant with once-through boiler for power system frequency simulation studies. IEEE Trans Power Syst 15(4):1322–1328
8. Lee KY, Hoe JS, Hoffman JA, Sung HK, Won HJ (2007) Neural network based modeling of large scale power plant. IEEE Power Engineering Society General Meeting No (24–28):1–8
9. Mohamed O, Wang J, Guo S, Al-Duri B, Wei J (2010) Modelling study of supercritical power plant and parameter identification using genetic algorithms. In: Proceedings of the World Congress on Engineering II, pp 973–978
10. Adams J, Clark DR, Luis JR, Spanbaur JP (1965) Mathematical modelling of once-through boiler dynamics. IEEE Trans Power Apparatus Syst 84(4):146–156
11. Salisbury JK (1950) Steam turbines & their cycles. Wiley, New York
12. Yu Y-N (1983) Electric power system dynamics. Academic Press, New York
13. Ghaffari A, Chaibakhsh A (2007) A simulated model for a once through boiler by parameter adjustment based on genetic algorithms. Simul Model Pract Theory 15:1029–1051

Chapter 2

Sequential State Computation Using Discrete Modeling

Dumitru Topan and Lucian Mandache

Abstract In this paper we present a method for the sequential computation of the state vector, either over pre-established time intervals or punctually. Based on discrete circuit models with direct or iterative companion diagrams, the proposed method is intended for a wide range of analog dynamic circuits: linear or nonlinear circuits, with or without excess elements or magnetically coupled inductors. The feasibility, accessibility and advantages of the method are demonstrated by the enclosed example.

2.1 Introduction

The discretization of the circuit elements, followed by the corresponding companion diagrams, leads to discrete circuit models associated with the analyzed analog circuits [1–3]. Using the Euler, trapezoidal or Gear approximations [4, 5], simple discretized models are generated, whose implementation leads to an auxiliary active resistive network. In this manner, the numerical computation of the desired dynamic quantities becomes easier and faster. Considering the time constants of the circuit, the discretization time step can be adjusted to reach the solution optimally in terms of precision and computation time.

D. Topan (corresponding author) Faculty of Electrical Engineering, University of Craiova, 13 A.I. Cuza Str., Craiova, 200585, Romania e-mail: [email protected]

L. Mandache Faculty of Electrical Engineering, University of Craiova, 107 Decebal Blv., Craiova, 200440, Romania e-mail: [email protected]

S. I. Ao and L. Gelman (eds.), Electrical Engineering and Applied Computing, Lecture Notes in Electrical Engineering, 90, DOI: 10.1007/978-94-007-1192-1_2, © Springer Science+Business Media B.V. 2011


The discrete modeling of nonlinear circuits also involves an iterative process that requires updating the parameters of the companion diagram at each iteration and each integration time step [5, 6]. If nonzero initial conditions exist, they are usually computed through a steady-state analysis performed prior to the transient analysis. The discrete modeling can be associated with the state variables approach [6, 7], as well as with the modified nodal approach [5, 8], the analysis strategy being chosen in accordance with the circuit topology, the number of energy storage circuit elements (capacitors and inductors) and the global size of the circuit. The known computation algorithms based on discrete modeling allow the sequential computation, step by step, along the whole analysis time, of the state vector or of the output vector directly [5, 9, 10]. This paper proposes a method that allows computing the state vector punctually, at the moments considered significant for the dynamic evolution of the circuit. The sequential computation for pre-established time subdomains is thus also allowed.

2.2 Modeling Through Companion Diagrams

The time domain analysis is performed for the time interval $[t_0, t_f]$, bounded by the initial moment $t_0$ and the final moment $t_f$. The interval is discretized with the constant time step $h$, chosen sufficiently small to allow the use of the Euler, trapezoidal or Gear numerical integration algorithms [1–5]. One can choose $t_0 = 0$ and $t_f = wh$, where $w$ is a positive integer.

The analog circuit analysis using discrete models requires replacing each circuit element by a proper model according to its constitutive equations. If the Euler approximation is used, the discretization equations and the corresponding discrete circuit models associated with the energy storage circuit elements are those shown in Table 2.1, for the time interval $[nh, (n+1)h]$, $n < w$.

The tree capacitor voltages $\mathbf{u}_C$ and the cotree inductor currents $\mathbf{i}_L$ [7, 8] are chosen as state quantities, assembled in the state vector $\mathbf{x}$. The currents $\mathbf{I}_C$ of the tree capacitors and the voltages $\mathbf{U}_L$ across the cotree inductors are complementary variables, assembled in the vector $\mathbf{X}$. At the moment $t = nh$, these vectors are partitioned as

$$\mathbf{x}^n = \begin{bmatrix} \mathbf{u}_C^n \\ \mathbf{i}_L^n \end{bmatrix}, \qquad \mathbf{X}^n = \begin{bmatrix} \mathbf{I}_C^n \\ \mathbf{U}_L^n \end{bmatrix}, \tag{2.1}$$

with obvious meanings of the vectors $\mathbf{u}_C^n$, $\mathbf{i}_L^n$, $\mathbf{I}_C^n$, $\mathbf{U}_L^n$.

For the magnetically coupled inductors, the discretized equations and the companion diagram are shown in Table 2.1, where the following notations are used:

$$\begin{aligned} R_{11}^{n+1} &= \frac{L_{11}}{h}, & R_{12}^{n+1} &= \frac{L_{12}}{h}, & e_1^{n+1} &= \frac{L_{11}}{h}\, i_1^n + \frac{L_{12}}{h}\, i_2^n, \\ R_{22}^{n+1} &= \frac{L_{22}}{h}, & R_{21}^{n+1} &= \frac{L_{21}}{h}, & e_2^{n+1} &= \frac{L_{22}}{h}\, i_2^n + \frac{L_{21}}{h}\, i_1^n. \end{aligned} \tag{2.2}$$

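Table 2.1 Discrete modeling of the energy storage elements

Element: tree capacitor ($C = 1/S$; voltage $u_C$, current $I_C$)
Discretized expression: $u_C^{n+1} = u_C^n + hS\, I_C^{n+1}$
Companion diagram: resistance $hS$ in series with the voltage source $u_C^n$

Element: excess capacitor ($C = 1/S$; voltage $U_C$, current $i_C$)
Discretized expression: $i_C^{n+1} = \frac{1}{hS}\, U_C^{n+1} - \frac{1}{hS}\, U_C^n$
Companion diagram: resistance $hS$ in parallel with the current source $\frac{1}{hS}\, U_C^n$

Element: cotree inductor ($L = 1/\Gamma$; current $i_L$, voltage $U_L$)
Discretized expression: $i_L^{n+1} = i_L^n + h\Gamma\, U_L^{n+1}$
Companion diagram: conductance $h\Gamma$ in parallel with the current source $i_L^n$

Element: excess inductor ($L = 1/\Gamma$; current $I_L$, voltage $u_L$)
Discretized expression: $u_L^{n+1} = \frac{1}{h\Gamma}\, I_L^{n+1} - \frac{1}{h\Gamma}\, I_L^n$
Companion diagram: resistance $1/(h\Gamma)$ in series with the voltage source $\frac{1}{h\Gamma}\, I_L^n$

Element: magnetically coupled inductor pair ($L_{11}$, $L_{22}$, $L_{12}$, $L_{21}$; terminals 1–1' and 2–2')
Discretized expressions:
$U_1^{n+1} = R_{11}\, i_1^{n+1} + R_{12}\, i_2^{n+1} - R_{11}\, i_1^n - R_{12}\, i_2^n$
$U_2^{n+1} = R_{21}\, i_1^{n+1} + R_{22}\, i_2^{n+1} - R_{21}\, i_1^n - R_{22}\, i_2^n$
Companion diagram: resistances $R_{11}$ and $R_{22}$, controlled sources $R_{12}\, i_2^{n+1}$ and $R_{21}\, i_1^{n+1}$, and independent sources $e_1^{n+1}$ and $e_2^{n+1}$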


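Table 2.2 Iterative discrete modeling

Elements with the characteristics $u = \hat{u}(i)$, $\varphi = \hat{\varphi}(i)$ and $u = \hat{u}(q)$ have a companion diagram made of the resistance $R^{n+1,m}$ and the voltage source $e^{n+1,m}$, which gives $i^{n+1,m+1}$. Elements with the characteristics $i = \hat{i}(u)$, $i = \hat{i}(\varphi)$ and $q = \hat{q}(u)$ have a companion diagram made of the conductance $G^{n+1,m}$ and the current source $j^{n+1,m}$, which gives $u^{n+1,m+1}$.

Element: $u = \hat{u}(i)$
Iterative dynamic parameter: $R^{n+1,m} = \left.\dfrac{\partial u}{\partial i}\right|_{i = i^{n+1,m}}$
Notations in the companion diagram: $e^{n+1,m} = u^{n+1,m} - R^{n+1,m}\, i^{n+1,m}$

Element: $i = \hat{i}(u)$
Iterative dynamic parameter: $G^{n+1,m} = \left.\dfrac{\partial i}{\partial u}\right|_{u = u^{n+1,m}}$
Notations in the companion diagram: $j^{n+1,m} = i^{n+1,m} - G^{n+1,m}\, u^{n+1,m}$

Element: $\varphi = \hat{\varphi}(i)$
Iterative dynamic parameter: $L^{n+1,m} = \left.\dfrac{\partial \varphi}{\partial i}\right|_{i = i^{n+1,m}}$
Notations in the companion diagram: $R^{n+1,m} = \frac{1}{h} L^{n+1,m}$, $\; e^{n+1,m} = u^{n+1,m} - \frac{1}{h} L^{n+1,m}\, i^{n+1,m}$

Element: $i = \hat{i}(\varphi)$
Iterative dynamic parameter: $\Gamma^{n+1,m} = \left.\dfrac{\partial i}{\partial \varphi}\right|_{\varphi = \varphi^{n+1,m}}$
Notations in the companion diagram: $G^{n+1,m} = h\,\Gamma^{n+1,m}$, $\; j^{n+1,m} = i^{n+1,m} - h\,\Gamma^{n+1,m}\, u^{n+1,m}$

Element: $q = \hat{q}(u)$
Iterative dynamic parameter: $C^{n+1,m} = \left.\dfrac{\partial q}{\partial u}\right|_{u = u^{n+1,m}}$
Notations in the companion diagram: $G^{n+1,m} = \frac{1}{h} C^{n+1,m}$, $\; j^{n+1,m} = i^{n+1,m} - \frac{1}{h} C^{n+1,m}\, u^{n+1,m}$

Element: $u = \hat{u}(q)$
Iterative dynamic parameter: $S^{n+1,m} = \left.\dfrac{\partial u}{\partial q}\right|_{q = q^{n+1,m}}$
Notations in the companion diagram: $R^{n+1,m} = h\, S^{n+1,m}$, $\; e^{n+1,m} = u^{n+1,m} - h\, S^{n+1,m}\, i^{n+1,m}$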

For nonlinear circuits, the state variable computation at the moment t ¼ ðn þ 1Þh requires an iterative process that converges towards the exact solution [4, 5]. A second upper index corresponds to the iteration order (see Table 2.2). Similar results to those of Tables 2.1 and 2.2 can be obtained using the trapezoidal [5, 11] or Gear integration rule [4, 5].
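The iterative use of a companion diagram can be illustrated with a small sketch. It is not taken from the paper: the circuit (a voltage source with a series resistance feeding one voltage-controlled nonlinear element), the element law `i_hat`, its derivative `g_hat` and all numerical values are hypothetical. At each iteration m the element is replaced by the conductance G and the current source j = i - G u, evaluated at the previous iterate, and the resulting linear resistive circuit is solved.

```python
def solve_operating_point(v_src, r_series, i_hat, g_hat, u0=0.0, tol=1e-12, max_iter=50):
    """Iterate the (G, j) companion model of a nonlinear element i = i_hat(u)."""
    u = u0
    for m in range(max_iter):
        g = g_hat(u)              # dynamic conductance at the current iterate
        j = i_hat(u) - g * u      # companion current source j = i - G u
        # Linear resistive circuit: (v_src - u) / r_series = g * u + j
        u_next = (v_src / r_series - j) / (1.0 / r_series + g)
        if abs(u_next - u) < tol:
            return u_next, m + 1
        u = u_next
    raise RuntimeError("companion iteration did not converge")

# Hypothetical element law and its derivative
i_hat = lambda u: 0.05 * u + 0.01 * u ** 3
g_hat = lambda u: 0.05 + 0.03 * u ** 2

u_op, iterations = solve_operating_point(10.0, 10.0, i_hat, g_hat)
```

In a transient analysis the same loop would run inside every time step, with the energy storage elements replaced by their discretized models.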

2.3 Sequential and Punctual State Computation

The treatment with discretized models assumes substituting the circuit elements with companion diagrams, which together form a resistive model diagram. It allows the sequential computation of the circuit solution.


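2.3.1 Circuits Without Excess Elements

If the given circuit contains neither capacitor loops nor inductor cutsets [7, 8], then from the discretization expressions associated with the energy storage elements (Table 2.1, lines 1 and 3), using the notations (2.1), one obtains

$$\mathbf{x}^{n+1} = \mathbf{x}^n + h \begin{bmatrix} \mathbf{S} & \mathbf{0} \\ \mathbf{0} & \boldsymbol{\Gamma} \end{bmatrix} \mathbf{X}^{n+1}, \tag{2.3}$$

where $\mathbf{S}$ is the diagonal matrix of capacitor elastances and $\boldsymbol{\Gamma}$ is the matrix of inductor reciprocal inductances. Starting from the companion resistive diagram, the complementary variables are obtained as output quantities [5, 10, 11] of the circuit:

$$\mathbf{X}^{n+1} = \mathbf{E}\, \mathbf{x}^n + \mathbf{F}\, \mathbf{u}^{n+1}, \tag{2.4}$$

where $\mathbf{E}$ and $\mathbf{F}$ are transmittance matrices, and $\mathbf{u}^{n+1}$ is the vector of input quantities [7, 8] at the moment $t = (n+1)h$. From (2.3) and (2.4) one obtains an equation that allows computing the state vector sequentially, starting from its initial value $\mathbf{x}^0 = \mathbf{x}(0)$ until the final value $\mathbf{x}^w = \mathbf{x}(wh)$:

$$\mathbf{x}^{n+1} = \mathbf{M}\, \mathbf{x}^n + \mathbf{N}\, \mathbf{u}^{n+1}, \tag{2.5}$$

where

$$\mathbf{M} = \mathbf{1} + h \begin{bmatrix} \mathbf{S} & \mathbf{0} \\ \mathbf{0} & \boldsymbol{\Gamma} \end{bmatrix} \mathbf{E}, \tag{2.6}$$

$\mathbf{1}$ being the identity matrix, and

$$\mathbf{N} = h \begin{bmatrix} \mathbf{S} & \mathbf{0} \\ \mathbf{0} & \boldsymbol{\Gamma} \end{bmatrix} \mathbf{F}. \tag{2.7}$$

Starting from Eq. 2.5, through mathematical induction, the useful formula is obtained as

$$\mathbf{x}^n = \mathbf{M}^n\, \mathbf{x}^0 + \sum_{k=1}^{n} \mathbf{M}^{n-k}\, \mathbf{N}\, \mathbf{u}^k, \tag{2.8}$$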

where the upper indexes of the matrix M are integer power exponents. The formula (2.8) allows the punctual computation of the state vector at any moment t ¼ nh, if the initial conditions of the circuit and the excitation quantities are known. If a particular solution xp ðtÞ of the state equation exists, it significantly simplifies the computation of the general solution xðtÞ. Using the Euler numerical integration method, one obtains [5]:

$$\mathbf{x}^{n+1} = \mathbf{M} \left( \mathbf{x}^n - \mathbf{x}_p^n \right) + \mathbf{x}_p^{n+1}. \tag{2.9}$$

The sequential computation of the state vector requires the prior construction of the matrix $\mathbf{E}$, according to Eqs. 2.6 and 2.9. This requires analyzing an auxiliary circuit obtained by setting all independent sources to zero in the given circuit. Starting from Eq. 2.9, the expression

$$\mathbf{x}^n = \mathbf{M}^n \left( \mathbf{x}^0 - \mathbf{x}_p^0 \right) + \mathbf{x}_p^n \tag{2.10}$$

allows the punctual computation of the state vector.
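The relation between the step-by-step recurrence x^{n+1} = M x^n + N u^{n+1} and the punctual formula x^n = M^n x^0 + sum_{k=1..n} M^{n-k} N u^k can be checked numerically. The sketch below is illustrative only: the function names and the matrices M and N are arbitrary assumptions, not values from the paper.

```python
import numpy as np

def sequential(M, N, x0, inputs):
    """Apply x^{n+1} = M x^n + N u^{n+1} step by step; inputs[k-1] holds u^k."""
    x = np.asarray(x0, dtype=float)
    for u in inputs:
        x = M @ x + N @ u
    return x

def punctual(M, N, x0, inputs):
    """Evaluate x^n = M^n x^0 + sum_{k=1}^{n} M^{n-k} N u^k directly."""
    n = len(inputs)
    x = np.linalg.matrix_power(M, n) @ np.asarray(x0, dtype=float)
    for k, u in enumerate(inputs, start=1):
        x = x + np.linalg.matrix_power(M, n - k) @ (N @ u)
    return x

# Arbitrary (hypothetical) matrices and a time-varying input sequence
M = np.array([[0.9, -0.1], [0.05, 0.8]])
N = np.array([[0.1, 0.0], [0.0, 0.2]])
x0 = np.array([1.0, 0.0])
inputs = [np.array([np.sin(0.3 * k), 1.0]) for k in range(1, 11)]

x_seq = sequential(M, N, x0, inputs)
x_pun = punctual(M, N, x0, inputs)
```

Both functions return the same state; the punctual form is convenient when only a few moments t = nh are of interest, since the powers of M can be computed once and reused.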

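2.3.2 Circuits with Excess Elements

The excess capacitor voltages [8, 11], assembled in the vector $\mathbf{U}_C$, as well as the excess inductor currents [5, 7, 8], assembled in the vector $\mathbf{I}_L$, can be expressed in terms of the state variables and excitation quantities, at the moment $t = nh$:

$$\begin{bmatrix} \mathbf{U}_C^n \\ \mathbf{I}_L^n \end{bmatrix} = \begin{bmatrix} \mathbf{K}_1 & \mathbf{0} \\ \mathbf{0} & \mathbf{K}_2 \end{bmatrix} \mathbf{x}^n + \begin{bmatrix} \mathbf{K}'_1 & \mathbf{0} \\ \mathbf{0} & \mathbf{K}'_2 \end{bmatrix} \mathbf{u}^n, \tag{2.11}$$

where the matrices $\mathbf{K}_1$, $\mathbf{K}'_1$ and $\mathbf{K}_2$, $\mathbf{K}'_2$ contain voltage and current ratios respectively. Using Table 2.1, the companion diagram associated with the analyzed circuit can be obtained, whence the complementary quantities are given by:

$$\mathbf{X}^{n+1} = \mathbf{E}\, \mathbf{x}^n + \mathbf{E}_1 \begin{bmatrix} \mathbf{U}_C^n \\ \mathbf{I}_L^n \end{bmatrix} + \mathbf{F}\, \mathbf{u}^{n+1}, \tag{2.12}$$

the matrices $\mathbf{E}$, $\mathbf{E}_1$ and $\mathbf{F}$ containing transmittance coefficients. Substituting Eqs. 2.11 and 2.12 into (2.3) gives a recurrence expression of the same type as (2.5), allowing the sequential computation of the state vector:

$$\mathbf{x}^{n+1} = \mathbf{M}\, \mathbf{x}^n + \mathbf{N}\, \mathbf{u}^{n+1} + \mathbf{N}_1\, \mathbf{u}^n, \tag{2.13}$$

where

$$\begin{aligned} \mathbf{M} &= \mathbf{1} + h \begin{bmatrix} \mathbf{S} & \mathbf{0} \\ \mathbf{0} & \boldsymbol{\Gamma} \end{bmatrix} (\mathbf{E} + \mathbf{E}_1 \mathbf{K}), \qquad \mathbf{N} = h \begin{bmatrix} \mathbf{S} & \mathbf{0} \\ \mathbf{0} & \boldsymbol{\Gamma} \end{bmatrix} \mathbf{F}, \qquad \mathbf{N}_1 = h \begin{bmatrix} \mathbf{S} & \mathbf{0} \\ \mathbf{0} & \boldsymbol{\Gamma} \end{bmatrix} \mathbf{E}_1 \mathbf{K}', \\ \mathbf{K} &= \begin{bmatrix} \mathbf{K}_1 & \mathbf{0} \\ \mathbf{0} & \mathbf{K}_2 \end{bmatrix}, \qquad \mathbf{K}' = \begin{bmatrix} \mathbf{K}'_1 & \mathbf{0} \\ \mathbf{0} & \mathbf{K}'_2 \end{bmatrix}. \end{aligned} \tag{2.14}$$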


If $\mathbf{x}_p$ is a particular solution of the state equation, the following identity is obtained:

$$\mathbf{N}\, \mathbf{u}^{n+1} + \mathbf{N}_1\, \mathbf{u}^n = \mathbf{x}_p^{n+1} - \mathbf{M}\, \mathbf{x}_p^n, \tag{2.15}$$

which allows converting (2.13) into the form (2.9), as a common expression for any circuit (with or without excess elements).

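2.4 Example

In order to exemplify the algorithm described above, let us consider the transient response of the circuit shown in Fig. 2.1, caused by turning on the switch. The circuit parameters are: $R_1 = R_2 = R_3 = 10\ \Omega$; $L = 10$ mH; $C = 100\ \mu$F; $E = 10$ V; $J = 1$ A. The time response of the capacitor voltage and of the inductor current will be computed for the time interval $t \in [0, 5]$ ms. These quantities are also the state variables. The corresponding discretized Euler companion diagram is shown in Fig. 2.2. According to the notations used in Sect. 2.2, we have:

$$\mathbf{x} = \begin{bmatrix} u_C \\ i_L \end{bmatrix}, \qquad \mathbf{X} = \begin{bmatrix} I_C \\ U_L \end{bmatrix}, \qquad \mathbf{u} = \begin{bmatrix} E \\ J \end{bmatrix}.$$

The way of computing the matrices $\mathbf{E}$ and $\mathbf{F}$ arises from the particular form of the expression (2.4):

$$\begin{bmatrix} I_C^{n+1} \\ U_L^{n+1} \end{bmatrix} = \begin{bmatrix} e_{11} & e_{12} \\ e_{21} & e_{22} \end{bmatrix} \begin{bmatrix} u_C^n \\ i_L^n \end{bmatrix} + \begin{bmatrix} f_{11} & f_{12} \\ f_{21} & f_{22} \end{bmatrix} \begin{bmatrix} E \\ J \end{bmatrix},$$

from where:

$$\begin{aligned} e_{11} &= \left.\frac{I_C^{n+1}}{u_C^n}\right|_{i_L^n = 0,\, E = 0,\, J = 0}, & e_{12} &= \left.\frac{I_C^{n+1}}{i_L^n}\right|_{u_C^n = 0,\, E = 0,\, J = 0}, \\ e_{21} &= \left.\frac{U_L^{n+1}}{u_C^n}\right|_{i_L^n = 0,\, E = 0,\, J = 0}, & e_{22} &= \left.\frac{U_L^{n+1}}{i_L^n}\right|_{u_C^n = 0,\, E = 0,\, J = 0}. \end{aligned}$$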

Fig. 2.1 Circuit example

Fig. 2.2 Discretized diagram

Using the diagram of Fig. 2.2, the elements of the matrices $\mathbf{E}$ and $\mathbf{F}$ were computed, assuming a constant time step $h = 0.1$ ms:

$$\mathbf{E} = \begin{bmatrix} -0.1729 & -0.7519 \\ 0.7519 & -9.7740 \end{bmatrix}, \qquad \mathbf{F} = \begin{bmatrix} 0.0827 & 0.8270 \\ 0.0752 & 0.7519 \end{bmatrix}.$$

The matrices $\mathbf{M}$ and $\mathbf{N}$ given by Eqs. 2.6, 2.7 are:

$$\mathbf{M} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} + 0.1 \cdot 10^{-3} \begin{bmatrix} \dfrac{1}{100 \cdot 10^{-6}} & 0 \\ 0 & \dfrac{1}{10 \cdot 10^{-3}} \end{bmatrix} \mathbf{E} = \begin{bmatrix} 0.8271 & -0.7519 \\ 0.0075 & 0.9023 \end{bmatrix},$$

$$\mathbf{N} = 0.1 \cdot 10^{-3} \begin{bmatrix} \dfrac{1}{100 \cdot 10^{-6}} & 0 \\ 0 & \dfrac{1}{10 \cdot 10^{-3}} \end{bmatrix} \mathbf{F} = \begin{bmatrix} 0.0827 & 0.8270 \\ 0.0008 & 0.0075 \end{bmatrix}.$$

Starting from the obvious initial condition

$$\mathbf{x}^0 = \begin{bmatrix} u_C^0 \\ i_L^0 \end{bmatrix} = \begin{bmatrix} 5\ \text{V} \\ 0.5\ \text{A} \end{bmatrix},$$

the solutions were computed using (2.8) and are represented in Fig. 2.3 with a solid line.

Fig. 2.3 Circuit response ($u_C$ [V] and $i_L$ [A] versus time [ms]; curves: $h = 0.1$ ms, $h' = 0.5$ ms and exact solution)


The computation was repeated in the same manner for a longer time step, $h' = 5h = 0.5$ ms, the solution being shown in the same figure. Both computed solutions are compared with the exact solution, represented with a thin dashed line.
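The example can be re-run with a few lines of code. The sketch below is a hedged reconstruction rather than the authors' program: the magnitudes of E and F are those quoted in the example, while the minus signs in E are an assumption, chosen so that M = 1 + h diag(S, Γ) E has the diagonal 0.8271 and 0.9023 and so that the response settles inside the range plotted in Fig. 2.3. Because the inputs E and J are constant, the steady state is used as the particular solution x_p in the punctual formula x^n = M^n (x^0 - x_p) + x_p.

```python
import numpy as np

h = 0.1e-3                        # time step, 0.1 ms
S = 1.0 / 100e-6                  # elastance of C = 100 uF
Gamma = 1.0 / 10e-3               # reciprocal inductance of L = 10 mH
D = np.diag([S, Gamma])

# Magnitudes from the example; the signs of E are an assumption (see lead-in)
E = np.array([[-0.1729, -0.7519], [0.7519, -9.7740]])
F = np.array([[0.0827, 0.8270], [0.0752, 0.7519]])

M = np.eye(2) + h * D @ E         # Eq. 2.6
N = h * D @ F                     # Eq. 2.7
u = np.array([10.0, 1.0])         # inputs E = 10 V, J = 1 A
x0 = np.array([5.0, 0.5])         # u_C(0) = 5 V, i_L(0) = 0.5 A

# Constant input: the fixed point of x = M x + N u is a particular solution
x_p = np.linalg.solve(np.eye(2) - M, N @ u)

def state_at(n):
    """Punctual computation, Eq. 2.10: x^n = M^n (x^0 - x_p) + x_p."""
    return np.linalg.matrix_power(M, n) @ (x0 - x_p) + x_p

x_5ms = state_at(50)              # state at t = 50 h = 5 ms
```

Under these assumptions the state moves from (5 V, 0.5 A) towards a steady state close to (6.67 V, 0.67 A), and `state_at(n)` gives the same values as n applications of the recurrence (2.5).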

2.5 Conclusion

The proposed analysis strategy and computation formulae allow not only the punctual computation of the state vector, but also crossing the integration subdomains with a variable time step. The proposed method harmonizes naturally with any procedure based on discrete models of analog circuits, including the methods for the iterative computation of nonlinear dynamic networks. The versatility of the method has already allowed an extension in connection with the modified nodal approach.

Acknowledgments This work was supported in part by the Romanian Ministry of Education, Research and Innovation under Grant PCE 539/2008.

References

1. Topan D, Mandache L (2010) Punctual state computation using discrete modeling. Lecture notes in engineering and computer science. In: Proceedings of the world congress on engineering, vol 2184, London, June 30–July 2 2010, pp 824–828
2. Henderson A (1990) Electrical networks. Edward Arnold, London, pp 319–325
3. Topan D (1978) Computerunterstützte Berechnung von Netzwerken mit zeitdiskretisierten linearisierten Modellen. Wiss. Zeitschr. T.H. Ilmenau, pp 99–107
4. Gear C (1971) The automatic integration of ordinary differential equations. ACM 14(3):314–322
5. Topan D, Mandache L (2007) Chestiuni speciale de analiza circuitelor electrice. Universitaria, Craiova, pp 115–143
6. Topan D (1995) Iterative models of nonlinear circuits. Ann Univ Craiova Electrotech 19:44–48
7. Rohrer RA (1970) Circuit theory: an introduction to the state variable approach. McGraw-Hill, New York, pp 3–4
8. Chua LO, Lin PM (1975) Computer-aided analysis of electronic circuits: algorithms and computational techniques. Prentice-Hall, Englewood Cliffs, Chaps. 8–9
9. Chen W-K (1991) Active network analysis. World Scientific, Singapore, pp 465–470
10. Opal A (1996) Sampled data simulation of linear and nonlinear circuits. IEEE Trans Computer-Aided Des Integr Circuits Syst 15(3):295–307
11. Boite R, Neirynck J (1996) Traité d'Electricité, vol IV: Théorie des Réseaux de Kirchhoff. Presses Polytechniques et Universitaires Romandes, Lausanne, pp 146–158

Chapter 3

Detection and Location of Acoustic and Electric Signals from Partial Discharges with an Adaptative Wavelet-Filter Denoising

Jesus Rubio-Serrano, Julio E. Posada and Jose A. Garcia-Souto

Abstract The objective of this research work is the design and implementation of a post-processing algorithm or ‘‘search and localization engine’’ that will be used for the characterization of partial discharges (PD) and the location of the source in order to assess the condition of paper-oil insulation systems. The PD is measured with two acoustic sensors (ultrasonic PZT) and one electric sensor (HF ferrite). The acquired signals are conditioned with an adaptative wavelet-filter which is configured with only one parameter.

3.1 Introduction

Degraded insulation is one of the main problems of power equipment. The reliability of power plants can be improved by preventive maintenance based on the condition assessment of the electrical insulation within the equipment. The insulation degrades during its service life due to the accumulation of mechanical, thermal and electric stresses. Partial discharges (PD) are stochastic electric phenomena that cause a large number of small discharges (<500 pC) inside the insulation [1–3].

J. Rubio-Serrano (corresponding author), J. E. Posada, J. A. Garcia-Souto GOTL, Department of Electronic Technology, Carlos III University of Madrid, c/Butarque 15, 28911, Leganes, Madrid, Spain e-mail: [email protected] J. E. Posada e-mail: [email protected] J. A. Garcia-Souto e-mail: [email protected]

S. I. Ao and L. Gelman (eds.), Electrical Engineering and Applied Computing, Lecture Notes in Electrical Engineering, 90, DOI: 10.1007/978-94-007-1192-1_3, Springer Science+Business Media B.V. 2011


PD are present in the transformers due to the gas dissolved in the oil, the humidity and other faults. They become a problem when PD activity is persistent in time or in a localized area. These are signs of an imminent failure of the power equipment. Thus, the detection, the identification [4] and the localization of PD sources are important tools of diagnosis. This paper deals with the design of the algorithm that processes the time-series and performs the statistical analysis of the signals acquired in the framework of the MEDEPA test bench in order to assess the insulation faults. This set-up is an experimental PD generation and measurement system designed in the University Carlos III of Madrid in order to study and develop electrical and ultrasonic sensors [5] and analysis techniques, which allow the characterization and the localization of PD. A PD is an electrical fast transient which produces a localized acoustic emission (AE) due to thermal expansion of the dielectric material [3]. It also generates chemical changes, light emission, etc. [6, 7]. In this work acoustic and electrical signals are processed together. AE is characterized and both methods of detection are put together to assess the activity of PD. The electro-acoustic conversion ratio of PD can be explored by these means [8].

3.2 Experimental Set-Up

The measurements are taken from the MEDEPA experimental set-up. It has the following blocks to generate different types of PD and acquire the signals (up to 100 MSps) from different sensors:

1. PD generation: the experimental set-up generates controlled PD from a high-voltage AC excitation that is reliable for the ultrasonic sensor characterization and the acoustic measurements.
2. Instrumentation for electrical measurement: the calibrated electrical measurement allows the correlation of generated PD and provides their basic characteristics (charge, instant of time, etc.).
3. Instrumentation for acoustic measurements: ultrasonic PZT detectors are used for measuring the AE outside the tank. Fiber-optic sensors are being developed for measurements inside [9].

The experimental set-up is an oil-filled tank with immersed electrodes that generate PD. The ultrasonic sensors (R15i, 150 kHz, ~1 V/Pa) are externally mounted on the tank walls. A wide-band ferrite (10 MHz) is used for electrical measurements and additional instrumentation (Techimp) provides electrical PD analysis. AE travels through the oil (1.5 mm/μs) and the PMMA wall (2.8 mm/μs) to several ultrasonic PZT sensors.

The mechanical and acoustic set-up is represented in Fig. 3.1a. The internal PD generator consists of two cylindrical electrodes, 6 cm in diameter, that are separated by several insulating paper layers. High-voltage AC at 50 Hz is applied between 4.3 and 8.7 kV, so PD are about 100 pC. The expected signals from a PD are as shown in Fig. 3.1b: a single electric signal and an acoustic signal for each channel. The delay between the electric and acoustic signals is calculated to locate the PD source spatially. Each single PD event produces a short-duration electric charge displacement (1 μs) that is far shorter than the detected acoustic burst. The electric pulses are detected in the generation circuit. The AE signals are detected in front of the electrodes at the same height on two different walls of the tank. Several sensors are used to obtain the localization of the PD source and the electro-acoustic identification of the PD.

Fig. 3.1 Experimental set-up for acoustic detection and location (a). PD single event observation: electric signal and acoustic signals from sensors 1 and 2 (b)


Fig. 3.2 PD electro-acoustic pattern: electric pattern, acoustic patterns of channels 1 and 2

3.3 Signals Characteristics

The detection of PD by electro-acoustic means has the following difficulties: the stochastic process of PD generation and the detection limits of electric and acoustic transients (signal level, identification and matching). The signals are acquired without any external synchronization due to their stochastic generation. A threshold on the AE signals is set to ensure at least one AE detection. Afterwards, the time series are analyzed without any reference to the number of PD signals or their time-stamps.

AE signals are necessary for the spatial location of PD, but they are often fewer in number than the electric signals due to the strong attenuation caused by the propagation through the oil and by the obstacles in the acoustic path. Specifically, the AE detection in the experiment has the following characteristics: an amplitude usually below 10 mV and a signal distortion due to the acoustic propagation path from the PD source to the PZT sensor. In addition, the acoustic angle of incidence to the sensor on the wall produces internal reflection and reverberation. These effects modify the shape, the energy and the power spectrum of the received signal, so an AE from a single PD is detected differently depending on the position of the sensor.

Figure 3.2 shows the characteristic transient waves at each sensor (electric and acoustic) that are associated with a single PD event. This is the electro-acoustic pattern. Although the AE signals come from the same PD, their characteristics are different.


Electric signals are easily detected in this experiment. They are used as a zero time reference to calculate the acoustic time of flight from the PD source to the AE sensor. Thus, electro-acoustic processing is performed on the basis of pairing the signals from different sensors and sensor types. Electric and AE signals have very different durations: 1 μs (electric) and 100 μs (AE). Multiple PD from the same or different sources can be generated within the duration of an AE signal, so the detected AE signal can be the result of the acoustic interference of several PD events. In addition, each AE signal can be associated with more than one electric signal by using time criteria. The first approach processes the different signals independently and uses statistical analysis to link them together and identify PD events [10]. In addition, an all-acoustic system of four or more channels is being developed to locate PD events on the basis of multichannel processing.

3.4 Signal Processing

The main objective of the algorithm is to analyze the time series in order to detect and statistically evaluate the PD activity and its characteristics: PMCC and energy of PD events, energy ratio between channels and delays. The selected processing techniques meet the following requirements [10]: (a) the same processing regardless of the characteristics of the signal, (b) accurate time-stamp of the detected signals, (c) detection based on the shape and the energy, (d) identification tools and (e) statistical analysis for signal pairing. The signal processing has the following structure: pattern selection, wavelet filtering, acoustic detection, electro-acoustic pairing, PD event identification and PD localization.

3.4.1 Pattern Selection

A model of PD is selected from the measurement of a single event (Fig. 3.1b). It is selected by one of these means: (a) the technician's observation of a set of signals that repeats with an expected delay, (b) the set of transients selected by amplitude criteria in each channel and (c) a previously stored PD, which is useful to study the aging of the insulation. The PD pattern is the set of selected transient waves (Fig. 3.2).

3.4.2 Wavelet Filtering

Signal denoising is performed by wavelet filtering that preserves the time and shape characteristics of the original PD signals [11, 12]. In addition, wavelet filters are self-configurable for different kinds of signals by using an automatic selection rule that extracts the main characteristics [13, 14]. The basic steps of the wavelet filtering are the following: (a) transformation of the signal into the wavelet space, (b) thresholding of the wavelet components (all coefficients smaller than a certain threshold are set to zero) and (c) reverse transformation of the non-zero components. As a result, the signal is obtained without the undesired noise. The wavelet transform is considered two-dimensional: in time and in scale or level of the wavelet. Each level is associated with a particular frequency band. After the n-level transformation, the signal in the wavelet space is a sum of wavelet decompositions (D) and approximations (A):

$$\text{signal} = A_n + \sum_{i=1}^{n} D_i \tag{3.1}$$

This tool is used in combination with the Pearson product-moment correlation coefficient (PMCC or ratio) and the energy from the cross-correlation to identify which indices of $D_i$ carry the main information of the pattern. The same indices are used to configure the filter that is applied to the acquired signals. PMCC is a statistical index that measures the linear dependence between two vectors X and Y. It is independent of the energy of the signals, so it is used to compare the wave shape of two signals of the same length, even when they are out of phase. It is defined by (3.2):

$$\mathrm{PMCC} = r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \, \sum_{i=1}^{n} (y_i - \bar{y})^2}} \tag{3.2}$$

The flowchart of the wavelet filtering is shown in Fig. 3.3. Note that the pattern for each channel is processed only once. Afterwards the reconstructed signal (PATTERNw) and the configuration of the filter are obtained. Each and every acquisition is filtered with these parameters. The filtered patterns (PATTERNw) and the filtered signals (SIGNALw) are the sum of their respective selected decompositions. First the pattern of each channel is filtered and conditioned, and then each and every acquisition is individually processed. This wavelet filter has the following advantages:

• It does not distort the waveform of the signals, so the temporal information is conserved. This is important for cross-correlation and PMCC.
• It does not delay the signal. This is important for the time-of-flight calculation.
• It is self-configurable. Once the threshold is set, the algorithm selects the decompositions that contain the main frequencies of the signals.
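The PMCC and the selection of decompositions can be sketched with NumPy as follows. This is an assumption-laden illustration, not the authors' implementation: the wavelet decompositions are replaced by synthetic sinusoidal components, the ordering criterion (descending |PMCC| with the pattern) and the stopping rule (PMCC of the partial reconstruction above `r_min`) are hypothetical stand-ins for the "parameter conditions" of the flowchart, and the names `pmcc`, `select_decompositions` and `r_min` are invented for the example.

```python
import numpy as np

def pmcc(x, y):
    """Pearson product-moment correlation coefficient of two equal-length vectors."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx, dy = x - x.mean(), y - y.mean()
    return float(np.sum(dx * dy) / np.sqrt(np.sum(dx ** 2) * np.sum(dy ** 2)))

def select_decompositions(pattern, details, r_min=0.99):
    """Add decompositions, best-correlated first, until the partial sum resembles the pattern."""
    order = sorted(range(len(details)),
                   key=lambda i: abs(pmcc(pattern, details[i])), reverse=True)
    selected = []
    partial = np.zeros(len(pattern))
    for i in order:
        selected.append(i)
        partial = partial + details[i]
        if pmcc(pattern, partial) >= r_min:
            break
    return selected, partial

# Synthetic stand-ins for three decompositions of a pattern
t = np.arange(200) / 200.0
details = [np.sin(2 * np.pi * 5 * t),
           0.5 * np.sin(2 * np.pi * 20 * t),
           0.05 * np.sin(2 * np.pi * 60 * t)]
pattern = details[0] + details[1] + details[2]

selected, pattern_w = select_decompositions(pattern, details)
shape_only = pmcc(pattern, 3.0 * pattern + 2.0)   # scaling and offset do not change r
```

With these synthetic components the two strongest ones are retained and the weakest is discarded, which mirrors the idea of keeping only the decompositions that carry the main information of the pattern.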

Fig. 3.3 Wavelet filtering flowchart (wavelet configuration parameters: Daubechies 'db20', n decompositions; the pattern is decomposed into D1 ... Dn and An, the decompositions are ordered by the parameter criteria as D1' ... Dn' and added one at a time until PATTERNw meets the parameter conditions at order m'; the resulting configuration i' = 1' ... m' gives PATTERNw = ΣDi' and is applied to every signal as SIGNALw = ΣDi')

3.4.3 Acoustic Detection

Each acoustic acquisition is compared with the acoustic pattern through the cross-correlation. Cross-correlation is used as a measure of the similarity between two signals. Moreover, the time location of each local peak matches the starting instant of a transient similar to the pattern. The value of the peak is also a good estimator of the similarity of the signals. Cross-correlation is used as a search engine to detect the transients that are the best candidates for coming from a PD. It is also used to associate a time-stamp with each one.


The algorithm analyzes the peaks of the cross-correlation in order to decide if the detected transient satisfies the minimum requirements of the selected parameters (energy, amplitude, PMCC, etc.). A maximum of four transients per acoustic acquisition are stored for statistical analysis.
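The use of cross-correlation as a search engine can be sketched as follows. The example is hypothetical: the pattern is a synthetic damped burst, the acquisition is that burst buried in weak noise at a known position, and only the single highest peak is returned (the text stores up to four transients per acquisition and applies further criteria that are not reproduced here).

```python
import numpy as np

def detect_start(signal, pattern):
    """Return the sample index where the pattern best matches the signal, and the peak value."""
    # In 'valid' mode, corr[k] = sum_n signal[n + k] * pattern[n]
    corr = np.correlate(signal, pattern, mode="valid")
    start = int(np.argmax(corr))
    return start, float(corr[start])

# Synthetic acoustic pattern: damped oscillation of 100 samples
n = np.arange(100)
pattern = np.exp(-n / 20.0) * np.sin(2 * np.pi * n / 10.0)

# Synthetic acquisition: weak noise with the pattern starting at sample 300
rng = np.random.default_rng(0)
signal = 0.01 * rng.standard_normal(1000)
signal[300:400] += pattern

start, peak = detect_start(signal, pattern)
```

The index of the peak is the time-stamp of the transient in samples; dividing it by the sampling rate gives the starting instant used later for the time-of-flight calculation.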

3.4.4 Electro-Acoustic Signal Association

The next step is the cross-correlation of the electric signals with the pattern. Although it is based on the same tool, some differences are introduced. In this case, the maximum absolute value of the cross-correlation is searched. Positive and negative peaks of the cross-correlation are detected and associated with the instantaneous phase of the power line voltage, which is an additional parameter for identification. The electric signals are searched in a temporal window that is compatible with the detected acoustic signal (3.3). Thus, the search within the electric acquisition is delimited between the time-stamp of the acoustic signal and a time period before it. This temporal window corresponds to the time that the AE takes to cross the tank. In the experiment of Fig. 3.1 the length of the temporal window is 350 μs, considering a sound speed in oil of ~1.5 mm/μs and a tank length of 500 mm.

$$t_{start}(\text{elec sig}) \in \left[\, t_{start}(\text{acous sig}) - \frac{\text{dist}_{tank}}{v_{sound}},\; t_{start}(\text{acous sig}) \,\right] \tag{3.3}$$

Each acoustic signal is matched with up to four electrical signals that satisfy Eq. 3.3, and the database of PD parameters is obtained. Afterwards, the presentation tool provides the histogram of the delay between paired signals in order to analyze the persistency of some delay values. The values with higher incidence correspond to a fault in the insulation. This process of acoustic detection and electro-acoustic signal association is implemented separately for each acoustic channel. In this approach the data obtained for each acoustic channel are independent of the others.
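The temporal window can be sketched in a few lines. The function names, the choice of keeping the first four matches in list order and the example time-stamps are assumptions made for illustration; the text only states that up to four electric signals inside the window are paired with each acoustic signal. Note that 500 mm at 1.5 mm/μs gives about 333 μs, which the text rounds up to a 350 μs window.

```python
def electric_window(t_acoustic_us, tank_length_mm=500.0, v_sound_mm_per_us=1.5):
    """Window [t_acoustic - tank_length / v_sound, t_acoustic] in microseconds."""
    return t_acoustic_us - tank_length_mm / v_sound_mm_per_us, t_acoustic_us

def pair_electric(t_acoustic_us, electric_stamps_us, max_matches=4):
    """Keep up to max_matches electric time-stamps that fall inside the window."""
    t_min, t_max = electric_window(t_acoustic_us)
    inside = [t for t in electric_stamps_us if t_min <= t <= t_max]
    return inside[:max_matches]

# Hypothetical time-stamps in microseconds
matches = pair_electric(1000.0, [100.0, 700.0, 800.0, 900.0, 950.0, 990.0, 1200.0])
delays = [1000.0 - t for t in matches]   # candidate times of flight for the histogram
```

The delays collected over many acquisitions feed the histogram mentioned above, where persistent values point to a fault in the insulation.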

3.4.5 PD Event Association

The association of the transients detected in the acoustic and electric channels provides sets of related signals that come, with a certain probability, from single PD events. Hence, each PD event is defined by three signals: the electric signal that is associated with both acoustic channels and the two corresponding acoustic signals. As a result, each PD event contains an electric time-stamp, which is the zero time reference, and the time of flight of each acoustic signal. These parameters and the association references are stored in a database as structured information that is used for the statistical analysis.


3.4.6 Localization of the PD Events

Once the database is generated, all the PD events are analyzed in order to assess the condition of the insulation. The fault inside the insulation is identified by the persistency of PD events and located acoustically. The localization is made in the plane that contains the acoustic sensors and the paper between the electrodes, since PD are generated in this region. Reduced to this 2-D case, Eq. 3.4 is used as a simple localization tool.

$$(x_{PD} - x_{S1})^2 + (y_{PD} - y_{S1})^2 = (v_{sound}\, T_{S1})^2$$
$$(x_{PD} - x_{S2})^2 + (y_{PD} - y_{S2})^2 = (v_{sound}\, T_{S2})^2 \qquad (3.4)$$

where (xS1, yS1) and (xS2, yS2) are the coordinates of sensors 1 and 2, respectively, and TS1 and TS2 are the times of flight of the acoustic signals detected by sensors 1 and 2, respectively. Equation 3.4 represents the intersection of two circles whose centers are located at the positions of sensors 1 and 2. When all the PD events are localized and represented, the cluster of PD from the same region is studied statistically in order to find the dependence between the parameters of the acoustic and electric measurements. These PD were probably generated in the same insulation fault, so their acoustic path, attenuation and other variables involved in the acoustic detection should be identical. The PD events of the same cluster are analyzed against the isolated PD events. This study delimits the range of values for a valid PD event. The persistency and the concentration of PD activity are the symptoms of the degradation of the insulation system. Hence, though isolated PD events can be valid, they are not relevant for the detection of faults inside the insulation.
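The two-circle intersection of Eq. 3.4 can be solved in closed form. The sketch below is one standard way to do it, not necessarily the authors' implementation; the function name `locate_pd` and the default sound speed of 1.5 mm/µs (the value quoted for oil) are assumptions. Two circles generally intersect in two points, so a further criterion, such as keeping the point that lies between the electrodes, is needed to select the physical solution.

```python
import math


def locate_pd(s1, s2, t1_us, t2_us, v_sound=1.5):
    """Intersect the two circles of Eq. 3.4.

    s1, s2: sensor coordinates (x, y) in mm; t1_us, t2_us: times of flight in
    microseconds; v_sound in mm/us. Returns a list with 0, 1 or 2 candidate
    positions (x, y) in mm.
    """
    (x1, y1), (x2, y2) = s1, s2
    r1, r2 = v_sound * t1_us, v_sound * t2_us
    d = math.hypot(x2 - x1, y2 - y1)
    # No solution if the sensors coincide or the circles do not touch.
    if d == 0.0 or d > r1 + r2 or d < abs(r1 - r2):
        return []
    # Distance from sensor 1 to the chord joining the intersection points.
    a = (r1 ** 2 - r2 ** 2 + d ** 2) / (2.0 * d)
    h = math.sqrt(max(r1 ** 2 - a ** 2, 0.0))
    xm = x1 + a * (x2 - x1) / d
    ym = y1 + a * (y2 - y1) / d
    p_a = (xm + h * (y2 - y1) / d, ym - h * (x2 - x1) / d)
    p_b = (xm - h * (y2 - y1) / d, ym + h * (x2 - x1) / d)
    return [p_a] if h == 0.0 else [p_a, p_b]
```

With sensors at (0, 0) and (100, 0) mm and a source at (30, 40) mm, the function returns the true position together with its mirror image (30, -40) about the line joining the sensors.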

3.5 Experimental Results

The proposed algorithm was applied to the acquisitions taken on the MEDEPA test bench (Fig. 3.1a). In this experiment 76 series of acoustic and electric signals were acquired simultaneously. Each time series lasts approximately 8 ms and is sampled at 100 MSps. First, the electric and acoustic patterns were selected from an isolated PD event (Fig. 3.1b) and filtered with the wavelet processing (Fig. 3.3). The electric pattern is a fast transient of about 7 MHz with a duration of approximately 1 µs. The length of the acoustic pattern is 35 µs and its central frequency is 150 kHz. A detail of the signals involved in the wavelet filtering to obtain the acoustic pattern of one sensor is shown in Fig. 3.4. A limitation of the selected pattern is the reverberation of the acoustic waves that is detected through the wall. For normal incidence of the acoustic signal on the PMMA wall (sound velocity of 2.8 mm/µs) the reflection takes 7 µs to reach the detector again (20 mm). Thus the distortion of the acoustic signal is observed from 7 µs onwards.

Fig. 3.4 Acoustic pattern of sensor 1 after reconstruction and its decomposition

Once the patterns are selected and filtered, every acquired signal series is filtered, processed with the cross-correlation and analyzed with the PMCC. As a result, the local peaks of the cross-correlation give the time-stamps that can be associated with PD events. These events are also characterized by their indexes and the transient waveforms that were found in the time series (Fig. 3.5).
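As a quick check of the reverberation figures quoted above, the reflection delay follows from the quoted path length and sound velocity. The sample count is derived here from the 100 MSps sampling rate and is not stated in the source:

$$t_{refl} = \frac{20\ \text{mm}}{2.8\ \text{mm}/\mu\text{s}} \approx 7.1\ \mu\text{s}, \qquad N_{refl} \approx 7.1\ \mu\text{s} \times 100\ \text{samples}/\mu\text{s} \approx 714\ \text{samples}$$

So roughly the first fifth of the 35 µs acoustic pattern (7.1/35 ≈ 0.2) is free of wall reverberation.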

Fig. 3.5 Electric and acoustic time-series. Sets of transient signals found as probable PD events


Fig. 3.6 Example of detected PD event (details of AE signal in sensor 1 and electric signal)

Acoustic and electric signals are matched together and the parameters of each signal are calculated. Figure 3.6 shows an example of one of the paired signals (acoustic sensor 1 and electric sensor 3) found by the algorithm. The parameters of each transient signal and of the pair are also shown. The signals can now be classified by their delays. In the experiment, some valid delay values have an incidence of four or more. The delay of maximum incidence is 102 µs for sensor 1 (Fig. 3.7) and 62 µs for sensor 2. Future work will examine the relation between energy and the PMCC as a function of location. The goal is to find and discriminate PD not only by their location but also by their expected values of energy and PMCC. Finally, Eq. 3.4 is employed to locate the origin of the acoustic signals in the plane (Fig. 3.8). It is important to emphasize that isolated PD events are observed and that they can be valid events. However, they are not relevant for the detection of faults inside the insulation because their low persistency and their location are not characteristic for the insulation condition assessment. Figure 3.8 shows the existence of a region inside the electrodes where PD were frequently generated. This region represents a damaged area in the insulation. The concentration of PD in this region is a symptom of an imminent failure of the electrical system.

Fig. 3.7 Histogram of delays obtained from the acoustic sensor 1 with the selected PD

Fig. 3.8 Location of the detected PD events in the plane
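The classification by delay described above amounts to a histogram followed by a persistency threshold. A minimal sketch is shown below; the function names and the 2 µs bin width are assumptions, while the threshold of four occurrences follows the "incidence of four or more" mentioned in the text.

```python
from collections import Counter


def delay_histogram(delays_us, bin_width_us=2.0):
    """Count delays per bin; each key is the lower edge of a bin in microseconds."""
    return Counter(int(d // bin_width_us) * bin_width_us for d in delays_us)


def persistent_delays(delays_us, bin_width_us=2.0, min_count=4):
    """Return (bin_start, count) pairs with at least min_count hits, most frequent first."""
    hist = delay_histogram(delays_us, bin_width_us)
    hits = [(b, c) for b, c in hist.items() if c >= min_count]
    return sorted(hits, key=lambda bc: -bc[1])
```

Delays that fall in sparsely populated bins correspond to the isolated PD events, which the text treats as valid but not relevant for fault detection.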

3.6 Conclusions and Future Work

The design and implementation of a post-processing algorithm is presented for the detection and location of partial discharges and the condition assessment of the insulation. The algorithm is able to parameterize the signals, to define the ranges and to delimit the time windows in order to locate and classify PD in transformers. It was applied to signals from internal PD that were acquired by external acoustic sensors, but it is being extended to superficial PD and internal acoustic fiber-optic sensors. The purpose of this signal processing within the framework of the MEDEPA test-bed is to locate, identify and parameterize PD activity in order to predict imminent failures in insulation systems. The main features of the proposed algorithm are the following: its feasibility to detect and identify PD signals from different sensors, the adaptability of the wavelet filtering based on an external pattern, and the multi-sensor statistical analysis instead of a single-event approach. In addition, the wavelet denoising does not alter the temporal characteristics. Although the algorithm is not yet designed for real-time use, after the time series are processed their parameters are stored in a database, which serves as a reference of PD activity for further studies and can be extended to the maintenance of transformers in service. In order to improve the signal detection and identification, the next step is the calibration of the tool with different kinds of PD activity (types, intensities and sources) and the statistical analysis of the parameters in the database. In addition to the time windowing and the pattern matching, other parameters will be considered to assess the probability of PD: persistency, 3-D location, energy and PMCC. The location of PD sources is of main concern in the application of AE. The objective is to implement a 3-D algorithm compatible with the designed tools, either with external sensors alone or also using internal sensors [9]. An all-acoustic system of four or more channels is under development to locate PD events on the basis of multi-channel processing. Finally, the electro-acoustic conversion ratio of PD activity is an open research line that can be addressed with the implemented statistical analysis.

Acknowledgments This work was supported by the Spanish Ministry of Science and Innovation, under the R&D projects No. DPI2006-15625-C03-01 and DPI2009-14628-C03-01 and the Research grant No. BES-2007-17322. PD tests have been made in collaboration with the High Voltage Research and Tests Laboratory of Universidad Carlos III de Madrid (LINEALT).

References

1. Bartnikas R (2002) Partial discharges: their mechanism, detection and measurement. IEEE Trans Dielectr Electr Insul 9(5):763–808
2. Van Brunt RJ (1991) Stochastic properties of partial discharges phenomena. IEEE Trans Electr Insul 26(5):902–948
3. Lundgaard LE (1992) Partial discharge – part XIV: acoustic partial discharge detection – practical application. IEEE Electr Insul Mag 8(5):34–43
4. Suresh SDR, Usa S (2010) Cluster classification of partial discharges in oil-impregnated paper insulation. Adv Electr Comput Eng J 10(5):90–93
5. Macia-Sanahuja C, Lamela H, Rubio J, Gallego D, Posada JE, Garcia-Souto JA (2008) Acoustic detection of partial discharges with an optical fiber interferometric sensor. In: IMEKO TC 2 Symposium on Photonics in Measurements
6. IEEE Guide for the Detection and Location of Acoustic Emissions from Partial Discharges in Oil-Immersed Power Transformers and Reactors (2007). IEEE Power Engineering Society
7. Santosh Kumar A, Gupta RP, Udayakumar K, Venkatasami A (2008) Online partial discharge detection and location techniques for condition monitoring of power transformers: a review. In: International Conference on Condition Monitoring and Diagnosis, Beijing, China, 21–24 April 2008
8. von Glahn P, Stricklett KL, Van Brunt RJ, Cheim LAV (1996) Correlations between electrical and acoustic detection of partial discharge in liquids and implications for continuous data recording. Electr Insul 1:69–74
9. Garcia-Souto JA, Posada JE, Rubio-Serrano J (2010) All-fiber intrinsic sensor of partial discharge acoustic emission with electronic resonance at 150 kHz. SPIE Proc 7726:7
10. Rubio-Serrano J, Posada JE, Garcia-Souto JA (2010) Digital signal processing for the detection and location of acoustic and electric signals from partial discharges. In: Proceedings of the World Congress on Engineering 2010, WCE 2010, 30 June–2 July 2010, London, UK, vol 2184, issue 1, pp 967–972
11. Ma X, Zhou C, Kemp IJ (2002) Automated wavelet selection and thresholding for PD detection. IEEE Electr Insul Mag 18(2):37–45
12. Keppel G, Zedeck S (1989) Data analysis for research designs – analysis of variance and multiple regression/correlation approaches. Freeman, New York
13. Wang K-C (2009) Wavelet-based speech enhancement using time-frequency adaptation. EURASIP J Adv Signal Process 2009:8
14. Pinle Q, Yan L, Ming C (2008) Empirical mode decomposition method based on wavelet with translation invariance. EURASIP J Adv Signal Process 2008:6

Chapter 4

Study on a Wind Turbine in Hybrid Connection with a Energy Storage System

Hao Sun, Jihong Wang, Shen Guo and Xing Luo

Abstract Wind energy has attracted attention as an inexhaustible and abundant energy source for electrical power generation, and its penetration level has increased dramatically worldwide in recent years. However, its intermittent nature is still a universally faced challenge. As a possible solution, combining energy storage technology with renewable power generation has been considered as one of the options in recent years. The paper aims to study and compare two feasible means of energy storage for wind power generation applications: compressed air energy storage (CAES) and electrochemical energy storage (ECES). A novel CAES structure in hybrid connection with a small power scale wind turbine is proposed. The mathematical model of the hybrid wind turbine system is developed and a simulation study of the system dynamics is given. In addition, a pneumatic power compensation control strategy is reported that achieves acceptable power output quality and a smooth mechanical connection transition.

H. Sun, J. Wang (corresponding author), S. Guo, X. Luo
School of Electronic, Electrical and Computer Engineering, University of Birmingham, Edgbaston, Birmingham, B15 2TT, UK

S. I. Ao and L. Gelman (eds.), Electrical Engineering and Applied Computing, Lecture Notes in Electrical Engineering, 90, DOI: 10.1007/978-94-007-1192-1_4, © Springer Science+Business Media B.V. 2011


4.1 Introduction

Nowadays, the world faces the challenge of meeting a continuously increasing energy demand while reducing the harmful impact on the environment. In particular, wind energy appears to be a preferable solution to take a considerable portion of the generation market, especially in the UK. However, the key challenge faced by wind power generation is intermittency. The variability of wind power, which arises from changes in wind speed, can lead to changes in power output from hour to hour. Figure 4.1 shows that the power output of a diversified wind power system usually changes by 5 to 20% from one hour to the next, either up or down [1].

Fig. 4.1 The hourly change of wind power output

Besides, energy regulatory policies around the world have been characterized by the introduction of competition in the power industry and market, at both the wholesale and the retail levels. The variable market has brought uncertain variations onto power transmission and distribution networks, which have been studied at length [2, 3]. It is highly desirable to alleviate such impacts through alternative technologies. One proposed solution is to introduce an element of storage, or an alternative supply, for use when the ambient flux is insufficient to guarantee supply to the demand. The main reason is that energy storage can make wind power available when it is most demanded. Apart from pumped water, batteries, hydrogen and super-capacitors, compressed air energy storage (CAES) is also a well-known, controllable and affordable energy storage technology [4–6]. In a CAES system, the excess power is used to compress air, which can be stored in a vessel or a cavern. The energy stored in the compressed air can be used to generate electricity when required. Compared with other types of energy storage schemes, CAES is sustainable and does not produce any chemical waste. In this paper, a comparative analysis between CAES and electrochemical energy storage (ECES) is conducted. A hybrid energy storage wind turbine system is proposed, which connects a typical wind turbine and a vane-type air motor for compressed air energy conversion. The mathematical model of the whole system is derived and a simulation study is conducted. The study shows promising merits of the proposed hybrid connection of wind turbines and CAES.


4.2 Electrochemical and Compressed Air Energy Storage

In this paper, the feasibility of energy storage for a 2 kW household small-scale wind turbine is analyzed. Electrochemical energy storage is the most popular type of energy storage in the world, from small to large scales. For instance, the lead-acid battery is the oldest rechargeable battery with the widest range of applications, and it is a mature and cost-effective choice among electrochemical batteries. The main advantages of ECES are no emissions, simple operation and higher energy efficiency. The efficiency of lead-acid batteries is generally around 80%. Compressed air energy storage is also clean, as no chemical disposal pollution is released to the environment [7]. However, CAES has a rather lower energy efficiency; much energy is lost during the process of thermal energy conversion [8–10].

A drawback of ECES is its relatively short lifetime, which shows mainly in the limited charge/discharge cycle life. For example, the cycle life of lead-acid batteries is roughly in the range of 500–1500 cycles. This issue can be more serious in wind power generation because of the high variation in wind speed and the low predictability of the wind power variation patterns; that is, the battery will be charged and discharged frequently. For CAES, the pneumatic components, including the compressor, air motor, tank, pipes and valves, are relatively robust; the major components have a lifetime of up to 50 years. Therefore, the lifetime of the whole system is determined mainly by the mechanical components in the system.

The capacity of an electrochemical battery is directly related to the active material in the battery. The more energy the battery can offer, the more active material it contains, and therefore its size, weight and price increase almost linearly with the battery capacity. For the compressed air system, the capacity correlates with the volume of the air storage tank. Although the pneumatic system also requires a large space to sustain long-term operation, it has proven more cost-effective in view of the practically free raw material (see Table 4.1 [11]).

Table 4.1 Typical marginal energy storage costs

Type                       Overall cost
Electrochemical storage    >$400/kWh
Pumped storage             $80/kWh
Compressed air             $1/kWh

The electromotive force of a lead-acid cell is only about 2 V owing to its electrochemical characteristics, so a large number of cells must be connected in series to obtain a higher terminal voltage. With this series connection, if one cell within the battery system fails, the whole battery may fail to store or deliver energy in the manner desired. Unfortunately, it is currently very hard to diagnose which cell in the system has failed, and replacing the whole battery pack is expensive. Besides, most lead-acid batteries designed for deep discharge are not sealed, and regular maintenance is therefore required because of the gas emission caused by water electrolysis during overcharging. In comparison, CAES only needs regular leakage tests and oil maintenance. In brief, the comparison between CAES and ECES is summarized in Table 4.2.

Table 4.2 Comparison between CAES and ECES

                CAES                           ECES
Service life    Long                           Short
Efficiency      Not high                       Very high
Size            Large, depends on tank size    Large, depends on cell number
Overall cost    Very cheap                     Very expensive
Maintenance     Needs regular maintenance      Hard to overhaul, needs regular maintenance
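The two numerical points made in this section, the series cell count and the marginal cost per kWh, can be illustrated with a short calculation. This is only an illustrative sketch: the function names and the 48 V and 10 kWh example values are assumptions, and the electrochemical figure from Table 4.1 is a lower bound (">$400/kWh"), so the corresponding result is a minimum.

```python
import math

# Marginal costs in $/kWh taken from Table 4.1; the electrochemical value is a lower bound.
COST_PER_KWH = {"electrochemical": 400.0, "pumped": 80.0, "compressed_air": 1.0}


def cells_in_series(terminal_voltage_v, cell_voltage_v=2.0):
    """Number of lead-acid cells (about 2 V each) needed to reach a terminal voltage."""
    return math.ceil(terminal_voltage_v / cell_voltage_v)


def marginal_storage_cost(kind, capacity_kwh):
    """Marginal cost in dollars of adding capacity_kwh of storage of the given kind."""
    return COST_PER_KWH[kind] * capacity_kwh
```

For example, a hypothetical 48 V battery bank needs 24 cells in series, and adding 10 kWh of capacity costs at least $4000 with electrochemical storage against about $10 with compressed air, according to the marginal figures of Table 4.1.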

4.3 The Hybrid Wind Turbine System with CAES

There are two possible system structures for a hybrid wind turbine system with compressed air energy storage; one has been demonstrated as an economical solution for utility-scale energy storage on the timescale of hours. The energy storage system diagram is illustrated in Fig. 4.2. Such systems have been successfully implemented in Huntorf in Germany, McIntosh in Alabama, Norton in Ohio, a municipality in Iowa and in Japan, and one is under construction in Israel [12]. The CAES system stores energy in the form of compressed air in an underground cavern. Air is compressed during off-peak periods and is used during peak periods to generate power with a turbo-generator/gas turbine system, compensating the variation of the demand. However, this system is at a disadvantage because it needs a large space to store the compressed air, such as a large underground cavern for large-scale power facilities. This may limit its applications in terms of site installation. Besides the above-mentioned issues, large-capacity converter and inverter systems are neither cost effective nor power effective. For wind turbines of smaller capacity, this paper presents a novel hybrid technology to couple energy storage to wind power generation. As shown in Fig. 4.3, the electrical and pneumatic parts are connected through a mechanical transmission mechanism. This electromechanical integration offers simplicity of design and therefore a higher efficiency and a better price-quality ratio. Also, the direct compensation of the torque variation of the wind turbine alleviates the stress imposed on the mechanical parts of the wind turbine.

Fig. 4.2 Utility-scale CAES application's diagram

Fig. 4.3 Small scale hybrid wind turbine with CAES

4.4 Modelling Study of the Hybrid Wind Turbine System

For the proposed system illustrated in Fig. 4.3, a detailed mathematical model has been derived, which is used for an initial test of the practicability of the whole hybrid system concept. At this stage, the system is designed to include a typical wind turbine with a permanent magnet synchronous generator (PMSG), a vane-type air motor and the associated mechanical power transmission system. The pneumatic system can be triggered to drive the turbine for power compensation during low wind power periods. The mathematical model of the whole system is developed and described below.

4.4.1 Mathematical Model of the Wind Turbine

For a horizontal axis wind turbine, the mechanical power output $P$ that can be produced by the turbine at steady state is given by:

$$P = \frac{1}{2}\rho \pi r_T^2 v_w^3 C_p \qquad (4.1)$$

where $\rho$ is the air density, $v_w$ is the wind speed and $r_T$ is the blade radius; $C_p$ expresses the capability of the turbine to convert energy from the wind. This coefficient depends on the tip speed ratio $\lambda = \omega_T r_T / v_w$ and the blade angle $\theta$, where $\omega_T$ denotes the turbine speed. As this requires knowledge of aerodynamics and the computations are rather complicated, numerical approximations have been developed [13, 14]. Here the following function will be used,

$$C_p(\lambda, \theta) = 0.22\left(\frac{116}{\lambda_i} - 0.4\theta - 5\right) e^{-12.5/\lambda_i} \qquad (4.2)$$

with

$$\frac{1}{\lambda_i} = \frac{1}{\lambda + 0.08\theta} - \frac{0.035}{\theta^3 + 1} \qquad (4.3)$$

To describe the impact of the dynamic behavior of the wind turbine, a simplified drive train model is considered.

$$\frac{d\omega_T}{dt} = \frac{1}{J_T}\left(T_T - T_L - B\omega_T\right) \qquad (4.4)$$

where $J_T$ is the inertia of the turbine blades, $T_T$ and $T_L$ are the torques of the turbine and of the low speed shaft, respectively, and $B$ is the damping coefficient of the drive train system.
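The turbine model of Eqs. 4.1 to 4.4 can be evaluated numerically as sketched below. The function names, the default air density of 1.225 kg/m³, the pitch angle expressed in degrees and the explicit Euler step are all assumptions of this sketch; the chapter itself implements the model in MATLAB/SIMULINK and does not list these details.

```python
import math


def power_coefficient(tsr, pitch_deg=0.0):
    """Power coefficient C_p of Eqs. 4.2 and 4.3 (pitch in degrees is an assumption)."""
    inv_lambda_i = 1.0 / (tsr + 0.08 * pitch_deg) - 0.035 / (pitch_deg ** 3 + 1.0)
    return 0.22 * (116.0 * inv_lambda_i - 0.4 * pitch_deg - 5.0) * math.exp(-12.5 * inv_lambda_i)


def turbine_power(v_wind, omega_t, r_blade, rho=1.225, pitch_deg=0.0):
    """Mechanical power of Eq. 4.1 in watts (SI units; rho default is an assumed value)."""
    tsr = omega_t * r_blade / v_wind  # tip speed ratio
    return 0.5 * rho * math.pi * r_blade ** 2 * v_wind ** 3 * power_coefficient(tsr, pitch_deg)


def drive_train_step(omega_t, torque_turbine, torque_load, inertia, damping, dt):
    """One explicit Euler step of the drive train model of Eq. 4.4."""
    d_omega = (torque_turbine - torque_load - damping * omega_t) / inertia
    return omega_t + dt * d_omega
```

With zero pitch, `power_coefficient` peaks at about 0.438 near a tip speed ratio of 6.3 and decreases on either side, which is useful to keep in mind when reading the simulation results.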

4.4.2 Modeling the Permanent Magnet Synchronous Generator (PMSG)

The model of a PMSG with a purely resistive load (for simplicity of analysis) is formed by the following equations. For the mechanical part:

$$\frac{d\omega_G}{dt} = \frac{1}{J_G}\left(T_G - T_e - F\omega_G\right) \qquad (4.5)$$

$$\frac{d\theta_G}{dt} = \omega_G \qquad (4.6)$$

For the electrical part:

$$\frac{di_d}{dt} = \frac{1}{L_d}v_d - \frac{R_s}{L_d}i_d + \frac{L_q}{L_d}p\,\omega_G i_q \qquad (4.7)$$

$$\frac{di_q}{dt} = \frac{1}{L_q}v_q - \frac{R_s}{L_q}i_q - \frac{L_d}{L_q}p\,\omega_G i_d - \frac{e\,p\,\omega_G}{L_q} \qquad (4.8)$$

$$T_e = 1.5p\left[e\,i_q + (L_d - L_q)\,i_d i_q\right] \qquad (4.9)$$

$$V_q = \frac{1}{3}\left[\sin(p\theta_G)\,(2V_{ab} + V_{bc}) + \sqrt{3}\,V_{bc}\cos(p\theta_G)\right] \qquad (4.10)$$

$$V_d = \frac{1}{3}\left[\cos(p\theta_G)\,(2V_{ab} - V_{bc}) - \sqrt{3}\,V_{bc}\sin(p\theta_G)\right] \qquad (4.11)$$

where $\theta_G$ and $\omega_G$ are the generator rotating angle and speed, $F$ is the combined viscous friction of rotor and load, $i$ is current, $v$ is voltage, $L$ is inductance, $R_s$ is the resistance of the stator windings, $p$ is the number of pole pairs of the generator, and $e$ is the amplitude of the flux induced by the permanent magnets of the rotor in the stator phases. The subscripts $a, b, c, d, q$ refer to the $a, b, c, d, q$ axes of the different electrical phases, respectively. The three-phase coordinates and the d–q rotating frame coordinates can be transformed into each other through Park's transformation [15].

Fig. 4.4 Structure of a vane type air motor with four vanes
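The electrical part of the generator model, Eqs. 4.7 to 4.9, translates directly into code. The sketch below uses hypothetical function and argument names (`flux` stands for the flux amplitude denoted e in the text) and only evaluates the right-hand sides; a time integrator of the reader's choice would have to be added to simulate the currents.

```python
def pmsg_torque(i_d, i_q, flux, l_d, l_q, pole_pairs):
    """Electromagnetic torque T_e of Eq. 4.9."""
    return 1.5 * pole_pairs * (flux * i_q + (l_d - l_q) * i_d * i_q)


def pmsg_current_derivatives(i_d, i_q, v_d, v_q, omega_g, r_s, l_d, l_q, flux, pole_pairs):
    """Right-hand sides of Eqs. 4.7 and 4.8: returns (di_d/dt, di_q/dt)."""
    did = v_d / l_d - r_s / l_d * i_d + l_q / l_d * pole_pairs * omega_g * i_q
    diq = (v_q / l_q - r_s / l_q * i_q
           - l_d / l_q * pole_pairs * omega_g * i_d
           - flux * pole_pairs * omega_g / l_q)
    return did, diq
```

For a machine with equal d and q inductances the reluctance term in the torque vanishes and the torque is simply proportional to the q-axis current.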

4.4.3 Model of the Vane-Type Air Motor

Figure 4.4 shows the sketch of a vane-type air motor with four vanes. In this paper, input port 1 is taken as the inlet port, so input port 2 is the outlet port. Compressed air is admitted through input port 1 from the servo valves and fills the cavity between the vanes, the housing and the rotor. Chamber A, which is open to input port 1, fills up under high pressure. Once the port is closed by the moving vane, the air expands to a lower pressure in a larger volume between the vane and the preceding vane, at which point the air is released via input port 2. The difference in air pressure acting on the vane results in a torque acting on the rotor shaft [16, 17]. A simplified vane motor structure is shown in Fig. 4.5. The vane working radius measured from the rotor centre, $x_a$, can be derived by:

$$x_a = e\cos\phi + \sqrt{R_m^2 - e^2\sin^2\phi} \qquad (4.12)$$

Fig. 4.5 Schematic diagram of the structure of a vane-type air motor

The volumes of chamber A and chamber B are derived as follows, and are denoted by the subscripts $a$ and $b$ in the equations of this part.

$$V_a = \frac{1}{2}L_m\left(R_m^2 - r^2\right)(\pi + \phi) + \frac{1}{4}L_m e^2\sin 2\phi + L_m e R_m\sin\phi \qquad (4.13)$$

$$V_b = \frac{1}{2}L_m\left(R_m^2 - r^2\right)(\pi - \phi) - \frac{1}{4}L_m e^2\sin 2\phi - L_m e R_m\sin\phi \qquad (4.14)$$

where $R_m$ is the radius of the motor body, $e$ is the eccentricity, $L_m$ is the vane active length in the axial direction, $\phi$ is the motor rotating angle and $r$ is the rotor radius. The pressures of chambers A and B can be derived [10]:

$$\dot{P}_a = -\frac{k\dot{V}_a}{V_a}P_a + \frac{k}{V_a}RT_sC_dC_0A_aX_a f(P_a, P_s, P_e) \qquad (4.15)$$

$$\dot{P}_b = -\frac{k\dot{V}_b}{V_b}P_b + \frac{k}{V_b}RT_sC_dC_0A_bX_b f(P_b, P_s, P_e) \qquad (4.16)$$

where $T_s$ is the supply temperature, $R$, $C_d$ and $C_0$ are air constants, $A$ is the effective port width of the control valve, $X$ is the valve spool displacement, and $f$ is a function of the ratio between the downstream and upstream pressures at the orifice. The drive torque is determined by the difference of the torque acting on the vane between the drive and exhaust chambers, and is given by [13]:

$$M = (P_a - P_b)\left(e^2\cos 2\phi + 2eR_m\cos\phi + R_m^2 - r^2\right)\frac{L_m}{2} \qquad (4.17)$$
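The geometric relations of the vane motor lend themselves to a small numerical check. Differentiating the chamber A volume of Eq. 4.13 with respect to the rotating angle gives exactly the bracket of Eq. 4.17 times L_m/2, so the drive torque equals the pressure difference multiplied by dV_a/dφ. The sketch below encodes Eqs. 4.12, 4.13 and 4.17; the function names and the dimensions used in the example are hypothetical.

```python
import math


def vane_radius(phi, r_body, ecc):
    """Vane working radius x_a of Eq. 4.12."""
    return ecc * math.cos(phi) + math.sqrt(r_body ** 2 - ecc ** 2 * math.sin(phi) ** 2)


def chamber_a_volume(phi, r_body, r_rotor, ecc, length):
    """Volume of chamber A, Eq. 4.13."""
    return (0.5 * length * (r_body ** 2 - r_rotor ** 2) * (math.pi + phi)
            + 0.25 * length * ecc ** 2 * math.sin(2.0 * phi)
            + length * ecc * r_body * math.sin(phi))


def drive_torque(p_a, p_b, phi, r_body, r_rotor, ecc, length):
    """Drive torque M of Eq. 4.17."""
    bracket = (ecc ** 2 * math.cos(2.0 * phi)
               + 2.0 * ecc * r_body * math.cos(phi)
               + r_body ** 2 - r_rotor ** 2)
    return (p_a - p_b) * bracket * length / 2.0
```

Comparing `drive_torque` with a finite-difference derivative of `chamber_a_volume` is a convenient way to catch transcription errors in either formula.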

4.4.4 Model of Mechanical Power Transmission

The power transmission system, which is similar to a vehicle air conditioning system, includes the clutch and the belt speed transmission to ensure coaxial running, as shown in Fig. 4.6 [18]. The clutch is engaged only when the turbine and the air motor operate at the same speed, to avoid mechanical damage to the system components. Even so, the system design still faces another challenge during the engagement: in most instances the air motor cannot reach a speed as high as that of the turbine generator. Therefore, the two plates of the belt transmission are designed with different diameters so that they act as a gearbox.

Fig. 4.6 The structure of the power transmission system in hybrid wind turbine

The main issue in modeling the power transmission is that two different configurations are present:

Case I Clutch disengaged: After the air motor has started, and before the two sides of the electromagnetic clutch reach the same speed, the clutch can be considered completely separated, while the air motor idles with the inertia load of the clutch friction plate. Considering friction and different payloads and applying Newton's second law of angular motion, we have

$$M - M_f\dot{\phi} = (J_a + J_f)\ddot{\phi} \qquad (4.18)$$

where $J_a$ is the air motor inertia, $J_f$ is the friction plate inertia, $M$ is the drive torque, $M_f$ is the friction coefficient, $\dot{\phi}$ is the angular velocity and $\ddot{\phi}$ is the angular acceleration. Both the active plate and the passive plate of the belt transmission can be considered as inertia load of the generator, so the total equivalent inertia is

$$J_{total} = J_{pass} + j^2 J_{act} \qquad (4.19)$$

where $J_{pass}$ and $J_{act}$ are the inertias of the passive and active plates, respectively, and $j$ is the speed ratio of the belt.

Case II Clutch engaged: Once the angular velocity of the air motor $\dot{\phi}$ meets the speed of the active plate $\omega_G / j$, the two sides of the clutch are engaged. After the engagement, the active plate and the friction plate can be treated together as one mass. The dynamic equations are as follows:

$$\begin{cases} M - M_f\dot{\phi} - T_{act} = (J_a + J_f + J_{act})\ddot{\phi} \\ T_{pass} = \dfrac{T_{act}\,\eta}{j} \\ \dfrac{d\omega_G}{dt} = \dfrac{1}{J_G + J_{pass}}\left(T_H + T_{pass} - T_e - F\omega_G\right) \\ \dot{\phi} = \dfrac{\omega_G}{j} \end{cases}$$

where $T_H$ is the input torque of the wind turbine high speed shaft and $\eta$ is the transfer efficiency of the belt. The system state variables are chosen to be $x_1$: pressure in chamber A, $x_2$: pressure in chamber B, $x_3$: rotated angle, $x_4$: angular speed, $x_5$: current in the d axis,

$x_6$: current in the q axis. The input variables are $u_1$: wind speed and $u_2$: input valve displacement. Combining the wind turbine, drive train and generator models, the state functions of the whole hybrid wind turbine system can then be described by:

$$\dot{x}_1 = -\frac{k\dot{V}_a}{V_a}x_1 + \frac{k}{V_a}RT_sC_dC_0A_a u_2 f(P_a, P_s, P_e)$$

$$\dot{x}_2 = -\frac{k\dot{V}_b}{V_b}x_2 + \frac{k}{V_b}RT_sC_dC_0A_bX_b f(P_b, P_s, P_e)$$

$$\dot{x}_3 = \frac{x_4}{j}$$

$$\dot{x}_4 = \frac{1}{J_G + J_{pass} + J_T\dfrac{\eta'}{j'^2} + (J_a + J_f + J_{act})\dfrac{\eta}{j^2}}\left\{\frac{\eta'\rho\pi r_T^2 u_1^3 C_p}{2x_4} - \frac{\eta'}{j'^2}Bx_4 + \frac{\eta}{j}\left(M - M_f\frac{x_4}{j}\right) - \frac{3}{2}p\left(e x_6 + L_d x_6 x_5 - L_q x_6 x_5\right) - Fx_4\right\}$$

$$\dot{x}_5 = \frac{v_d}{L_d} - \frac{R_s}{L_d}x_5 + \frac{L_q}{L_d}p\,x_4 x_6$$

$$\dot{x}_6 = \frac{v_q}{L_q} - \frac{R_s}{L_q}x_6 - \frac{L_d}{L_q}p\,x_4 x_5 - \frac{e\,p\,x_4}{L_q}$$

where $\eta'$ and $j'$ are the efficiency and the speed ratio of the wind turbine gearbox. With such a complicated model structure, it is sometimes difficult to obtain accurate values of the system parameters. Intelligent optimization and identification methods have proved effective in tackling this challenging problem [19, 20]. The test system for the proposed hybrid system structure is under development in the authors' laboratory, and the data obtained from the rig can be used to improve the model accuracy.
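The switching between Case I and Case II of the power transmission model can be expressed as a simple speed-matching test, together with the belt relations used in Case II and Eq. 4.19. The sketch below is illustrative only: the function names and the speed tolerance are assumptions, since the chapter states only that the clutch is engaged when the speeds are equal.

```python
def clutch_should_engage(motor_speed, generator_speed, belt_ratio, tolerance=0.5):
    """True when the air motor speed matches the active plate speed omega_G / j.

    Speeds in rad/s; tolerance (rad/s) is an assumed engagement band.
    """
    return abs(motor_speed - generator_speed / belt_ratio) <= tolerance


def passive_plate_torque(active_torque, belt_ratio, belt_efficiency):
    """Torque delivered to the passive plate in Case II: T_pass = T_act * eta / j."""
    return active_torque * belt_efficiency / belt_ratio


def equivalent_inertia(j_pass, j_act, belt_ratio):
    """Total equivalent inertia of the belt plates, Eq. 4.19."""
    return j_pass + belt_ratio ** 2 * j_act
```

In a simulation loop, Eq. 4.18 would be integrated while `clutch_should_engage` returns False, and the coupled Case II equations would take over once it returns True.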

4.5 Simulation Study

The model derived above for the proposed hybrid wind turbine system is implemented in the MATLAB/SIMULINK environment to observe the dynamic behavior of the whole system, as shown in Fig. 4.7. The simulation results are described below. The simulation considers the scenario in which the input wind speed steps down within a 40 s observation window, that is, it drops from 9 to 8 m/s at 20 s (see Fig. 4.8). For comparison, the results from the hybrid system using a 6 bar supply pressure and those from the stand-alone system without pneumatic actuators are shown in Fig. 4.9. It can be seen that the hybrid system still reaches a high turbine speed owing to the contribution of the air motor output. It also maintains a steady value even when the natural wind speed decreases. However, the power coefficient of the turbine falls because of the increased tip speed ratio $\lambda = \omega_T r_T / v_w$. This should be considered an adverse effect of the hybrid system.

Fig. 4.7 The block diagram of the simulation system

Fig. 4.8 Input wind speed (wind speed in m/s versus time in s)

Fig. 4.9 Simulation results of wind turbine (turbine speed in rad/s and power coefficient versus time in s; hybrid and stand-alone)

Figure 4.10 contrasts the hybrid and stand-alone configurations in terms of generator operation. It can be seen that the power compensation can almost overcome the energy shortfall at the lower wind speed. Figure 4.11 shows the simulation results of the vane-type air motor. The air motor started at 20 s and joined the wind turbine system rapidly owing to its fast response. It is worth noting that this type of air motor generally runs with a well-marked periodic fluctuation, which originates from the cyclically changing difference between $P_a$ and $P_b$ (the pressures in chamber A and chamber B). However, in the hybrid system the air motor operates rather smoothly, which may result from the large inertia of the whole system.

Fig. 4.10 Simulation results of the responses of the PMSG (generator speed in rad/s and generator power in W versus time in s; hybrid and stand-alone)

Fig. 4.11 Simulation results of vane type air motor (air motor speed in rad/s and chamber pressures Pa and Pb in Pa versus time in s)
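The fall of the power coefficient with increasing tip speed ratio can be made explicit from Eqs. 4.2 and 4.3. The short derivation below assumes a blade angle of zero, which the chapter does not state, and writes $z = 1/\lambda_i = 1/\lambda - 0.035$:

$$C_p = 0.22\,(116z - 5)\,e^{-12.5z}, \qquad \frac{dC_p}{dz} = 0.22\,e^{-12.5z}\left[116 - 12.5\,(116z - 5)\right]$$

Setting the derivative to zero gives

$$z^* = \frac{116 + 62.5}{12.5 \times 116} \approx 0.1231, \qquad \lambda^* = \frac{1}{z^* + 0.035} \approx 6.33, \qquad C_{p,max} \approx 0.438$$

For $\lambda > \lambda^*$, $z$ lies below $z^*$ where $dC_p/dz > 0$, so $C_p$ decreases as $\lambda$ grows. Under this assumption, a turbine that is held at a higher speed by the air motor while the wind speed drops moves further above $\lambda^*$, which is consistent with the reduced power coefficient observed in the simulation.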

4.6 Concluding Remarks

This paper presents a concise review of two types of energy storage technologies and introduces a new concept of CAES applied to a small-scale wind turbine system. The complete mathematical model of the process is derived and implemented in the MATLAB/SIMULINK environment. The simulation results are encouraging: the extra power from the air motor output compensates for the power shortfall from wind energy. This strategy enables the wind turbine to operate at a


relatively uniform speed profile, which in turn improves the operating condition of the overall system. The simple structure of the system and the advantages of CAES could provide opportunities for such a system in the future renewable energy electricity market. Research on hybrid wind turbines is ongoing and further improvement is expected. An advanced tracking control strategy is a promising methodology and is currently under consideration by the research team [7, 21].

Acknowledgments The authors would like to thank ERDA/AWM for supporting the Birmingham Science City Energy Efficiency & Demand Reduction project, the China 863 Project (2009AA05Z212), and the University of Birmingham, UK, for the scholarships for Hao Sun and Xing Luo.

References

1. Sinden G (2005) Wind power and UK wind resource. Environmental Change Institute, University of Oxford
2. Akhmatov V (2002) Variable-speed wind turbines with doubly-fed induction generators. Part II: power system stability. Wind Eng 26(3):71–88
3. Hansen AD, Michalke G (2007) Fault ride-through capability of DFIG wind turbines. Renew Energy 32:1594–1610
4. Cavallo A (2007) Controllable and affordable utility-scale electricity from intermittent wind resources and compressed air energy storage (CAES). Energy 32:120–127
5. Lemofouet S, Rufer A (2006) A hybrid energy storage system based on compressed air and supercapacitors with maximum efficiency point tracking (MEPT). IEEE Trans Ind Electron 53(4):1105–1115
6. Van der Linden S (2006) Bulk energy storage potential in the USA, current developments and future prospects. Energy 31:3446–3457
7. Wang J, Pu J, Moore P (1999) Accurate position control of servo pneumatic actuator systems: an application to food packaging. Control Eng Pract 7(6):699–706
8. Yang L, Wang J, Lu N et al (2007) Energy efficiency analysis of a scroll-type air motor based on a simplified mathematical model. In: The Proceedings of the World Congress on Engineering. London, pp 759–764, 2–4 July 2007
9. Wang J, Yang L, Luo X, Mangan S, Derby JW (2010) Mathematical modelling study of scroll air motors and energy efficiency analysis, Part I. IEEE-ASME Transactions on Mechatronics. doi:10.1109/TMECH.2009.2036608
10. Wang J, Luo X, Yang L, Shpanin L, Jia N, Mangan S, Derby JW (2010) Mathematical modelling study of scroll air motors and energy efficiency analysis, Part II. IEEE-ASME Transactions on Mechatronics. doi:10.1109/TMECH.2009.2036607
11. Price A (2009) The current status of electrical energy storage systems. ESA London Meeting, London, UK
12. Vongmanee V (2009) The renewable energy applications for uninterruptible power supply based on compressed air energy storage system. In: IEEE Symposium on Industrial Electronics and Applications (ISIEA 2009). Kuala Lumpur, Malaysia, 4–6 October 2009
13. Heier S (1998) Grid integration of wind energy conversion systems. Wiley, Chichester
14. Sun H, Wang J, Guo S, Luo X (2010) Study on energy storage hybrid wind power generation systems. In: The Proceedings of the World Congress on Engineering 2010 WCE 2010, vol II, London, UK, June 30–July 2, pp 833–838
15. Pillay P, Krishnan R (1989) Modeling, simulation and analysis of permanent magnet motor drives, part 1: the permanent-magnet synchronous motor drive. IEEE Trans Ind Appl 25:265–273
16. Luo X, Wang J, Shpanin L, Jia N, Liu G, Zinober A (2008) Development of a mathematical model for vane-type air motors with arbitrary N vanes. In: International Conference of Applied and Engineering Mathematics, WCE, vol I–II. London, pp 362–367, July 2008
17. Wang J, Pu J, Moore PR, Zhang Z (1998) Modelling study and servo-control of air motor systems. Int J Control 71(3):459–476
18. Yeung YPB, Cheng KWE, Chan WW, Lam CY, Choi WF, Ng TW (2009) Automobile hybrid air conditioning technology. In: The Proceedings of the 3rd International Conference on Power Electronics Systems and Applications, p 116
19. Wei JL, Wang J, Wu QH (2007) Development of a multi-segment coal mill model using an evolutionary computation technique. IEEE Trans Energy Convers 22:718–727
20. Zhang YG, Wu QH, Wang J, Oluwanda G, Matts D, Zhou XX (2002) Coal mill modelling by machine learning based on on-site measurement. IEEE Trans Energy Convers 17(4):549–555
21. Wang J, Kotta U, Ke J (2007) Tracking control of nonlinear pneumatic actuator systems using static state feedback linearization of the input-output map. In: Proceedings of the Estonian Academy of Sciences-Physics Mathematics, vol 56, pp 47–66

Chapter 5

SAR Values in a Homogenous Human Head Model Levent Seyfi and Ercan Yaldız

Abstract The purpose of this chapter is to present how to determine and reduce the specific absorption rate (SAR) for a mobile phone user. Both an experimental measurement technique and a numerical computing method are described. Furthermore, an application on reducing the SAR value induced in the human head is carried out by numerical computation. A mobile phone operating at 900 MHz and shielded with copper is considered in order to reduce SAR. The simulations calculate the maximum SAR values and are conducted in the Matlab programming language using the two-dimensional (2D) Finite Difference Time Domain (FDTD) method. Calculations are made separately for 1 g and 10 g averaging. The head model structure is assumed to be uniform.

5.1 Introduction

Today the mobile phone is one of the most widely used pieces of electronic equipment, with a large number of users of all ages. For this reason, designing mobile phones that do not adversely affect human health is of great importance. Mobile phones are mostly used very close to the ear, as shown in Fig. 5.1. In this case, the electromagnetic (EM) wave of the mobile phone radiates mainly towards the user’s head (that is, the brain).

L. Seyfi (&) E. Yaldız Department of Electrical and Electronics Engineering, Selçuk University, Konya, Turkey e-mail: [email protected] E. Yaldız e-mail: [email protected]

S. I. Ao and L. Gelman (eds.), Electrical Engineering and Applied Computing, Lecture Notes in Electrical Engineering, 90, DOI: 10.1007/978-94-007-1192-1_5, © Springer Science+Business Media B.V. 2011


Fig. 5.1 Distribution of EM waves from a mobile phone on human head

Mobile phones communicate by transmitting radio frequency (RF) waves through base stations. RF waves are non-ionizing radiation, which can neither break chemical bonds nor cause ionization in the human body. The operating frequencies of mobile phones range between 450 and 2700 MHz, depending on the country and the service provider. RF exposure of a user falls off rapidly with increasing distance from the mobile phone. Using the phone in areas of good reception decreases exposure, as it allows the phone to communicate at low power. A large number of studies have been performed over the last two decades to assess whether mobile phones create a potential health risk [1, 2]. To date, no adverse health effects have been established for mobile phone use [3]. Investigations of the effects on human health of mobile phones and other devices emitting EM waves, and of measures against them, are still continuing.

According to the evaluated results, a 1°C temperature increase in tissue cannot be removed by the circulatory system, and this damages the tissue. Limits for each frequency band were specified by the relevant institutions according to this criterion. For the general public, the limits for the whole-body average SAR and the localized SAR are 0.08 and 2 W/kg, respectively, in the 10 MHz–10 GHz frequency band [4]. SAR (W/kg) is the power absorbed per unit mass of tissue.

Measuring SAR values in living cells is not experimentally possible. A specially created model (phantom) and specialized laboratory test equipment are used instead, and SAR values are measured by placing a probe into the phantom. The equipment consists of a phantom (human or box), a precision robot, RF field sensors, and a mobile phone holder, as shown in Fig. 5.2. The phantom is filled with a liquid that approximately represents the electrical properties of human tissue. As an alternative to using a phantom, SAR values can also be determined by numerical calculation [5–8]. In this case, the calculations are executed as simulations using the electrical properties and physical dimensions of a typical human head.

Mobile phones are manufactured to stay within the SAR limits. However, negative consequences may appear over time because phones are held close to the head during calls and because of long phone calls. Some precautions may therefore be required. For instance, a headset with a microphone can be used while calling. Alternatively, the EM waves emitted from the mobile phone towards the user’s head can be attenuated by using a conductive material. Conductive material


Fig. 5.2 Experimentally measuring SAR value with a phantom

mostly reflects the EM waves back. Hence, the absorption of EM waves can be reduced to a minimum level by placing a suitably sized conductive plate between the mobile phone’s antenna and the user’s head. Other studies have introduced different techniques to reduce SAR [9, 10]. In this chapter, the 2D-FDTD technique, absorbing boundary conditions, and the SAR calculation method are described, and a numerical application is presented. In the application, 2D simulations have been conducted to investigate the reduction of SAR values in the user’s head using a copper plate. The simulations have been carried out in the Matlab programming language using the 2D-FDTD method. First-order Mur boundary conditions have been used to remove the artificial reflections that naturally occur in the FDTD method.

5.2 2D-FDTD Method

When Maxwell’s differential equations are considered, it can be seen that the change in the E-field in time depends on the change in the H-field across space. This results in the basic FDTD time-stepping relation: at any point in space, the updated value of the E-field in time depends on the stored value of the E-field and the numerical curl of the local distribution of the H-field in space. The same holds for the H-field, with the roles of E and H exchanged. Iterating the E-field and H-field updates results in a marching-in-time process wherein sampled-data analogs of the continuous EM waves under consideration propagate in a numerical grid stored in the computer memory. Yee proposed that the vector components of the E-field and H-field be spatially staggered about rectangular unit cells of a Cartesian computational grid so that each E-field vector component is located midway between a pair of H-field vector components, and conversely [11, 12]. This scheme, now known as a Yee lattice, forms the core of many FDTD codes. The choices of grid cell size and time step size are very important in applying FDTD. The cell size must be small enough to permit accurate results at the highest operating frequency, and large enough to keep computer requirements manageable.


Cell size is directly affected by the materials present. The greater the permittivity or conductivity, the shorter the wavelength at a given frequency and the smaller the cell size required. The cell size must be much less than the smallest wavelength for which accurate results are desired. An often used cell size is $\lambda/10$ or less at the highest frequency. For some situations, such as a very accurate determination of radar scattering cross-sections, $\lambda/20$ or smaller cells may be necessary. On the other hand, good results are obtained with as few as four cells per wavelength. If the cell size is made much larger than this, the Nyquist sampling limit, $\lambda = 2\Delta x$, is approached too closely for reasonable results to be obtained, and significant aliasing is possible for signal components above the Nyquist limit.

Once the cell size is selected, the maximum time step is determined by the Courant stability condition. Smaller time steps are permissible, but do not generally improve computational accuracy except in special cases. A larger time step results in instability. To understand the basis for the Courant condition, consider a plane wave propagating through an FDTD grid. In one time step, any point on this wave must not pass through more than one cell, because during one time step FDTD can propagate the wave only from one cell to its nearest neighbors. To determine this time step constraint, a plane wave direction is considered such that the plane wave propagates most rapidly between field point locations. This direction will be perpendicular to the lattice planes of the FDTD grid. For a grid of dimension d (where d = 1, 2, or 3), with all cell sides equal to $\Delta u$, and with v the maximum velocity of propagation in any medium in the problem, usually the speed of light in free space, it is found that [13]

$$v\,\Delta t \le \frac{\Delta u}{\sqrt{d}} \qquad (5.1a)$$

for stability. If the cell sizes are not equal, the condition for a 2D and a 3D rectangular grid is, respectively [14, 15],

$$v\,\Delta t \le 1 \Big/ \sqrt{\frac{1}{(\Delta x)^2} + \frac{1}{(\Delta y)^2}} \qquad (5.1b)$$

$$v\,\Delta t \le 1 \Big/ \sqrt{\frac{1}{(\Delta x)^2} + \frac{1}{(\Delta y)^2} + \frac{1}{(\Delta z)^2}} \qquad (5.1c)$$

where $\Delta t$ is the temporal increment and $\Delta x$, $\Delta y$, $\Delta z$ are the spatial increments (the cell sides) in the x-, y-, and z-directions, respectively.

Although the real world is obviously 3D, many useful problems can be solved in two dimensions when one of the dimensions is much longer than the other two. In this case, it is generally assumed that the field solution does not vary in this dimension, which simplifies the analysis greatly. In electromagnetics, this assumption permits the Maxwell equations to be decoupled into two sets of fields, or modes, which are often called transverse magnetic and


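transverse electric. Any field subject to the assumption of no variation in z can be written as the sum of these modes. Transverse magnetic modes (TMz) contain the field components Ez(x, y, t), Hx(x, y, t), and Hy(x, y, t). Transverse electric modes (TEz) contain the field components Hz(x, y, t), Ex(x, y, t), and Ey(x, y, t). The 2D TM mode is [16]

$$\frac{\partial H_x}{\partial t} = \frac{1}{\mu}\left(-\frac{\partial E_z}{\partial y} - \rho' H_x\right) \qquad (5.2a)$$

$$\frac{\partial H_y}{\partial t} = \frac{1}{\mu}\left(\frac{\partial E_z}{\partial x} - \rho' H_y\right) \qquad (5.2b)$$

$$\frac{\partial E_z}{\partial t} = \frac{1}{\varepsilon}\left(\frac{\partial H_y}{\partial x} - \frac{\partial H_x}{\partial y} - \sigma E_z\right) \qquad (5.2c)$$

The 2D TE mode is

$$\frac{\partial E_x}{\partial t} = \frac{1}{\varepsilon}\left(\frac{\partial H_z}{\partial y} - \sigma E_x\right) \qquad (5.3a)$$

$$\frac{\partial E_y}{\partial t} = \frac{1}{\varepsilon}\left(-\frac{\partial H_z}{\partial x} - \sigma E_y\right) \qquad (5.3b)$$

$$\frac{\partial H_z}{\partial t} = \frac{1}{\mu}\left(\frac{\partial E_x}{\partial y} - \frac{\partial E_y}{\partial x} - \rho' H_z\right) \qquad (5.3c)$$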

where $\mu$, $\rho'$, $\varepsilon$, and $\sigma$ are the permeability, equivalent magnetic resistivity, permittivity, and conductivity, respectively. The TM and TE modes are decoupled; that is, they contain no common field vector components. In fact, these modes are completely independent for structures composed of isotropic materials, so they can exist simultaneously with no mutual interactions. Problems having both TM and TE excitation can be solved by a superposition of these two separate problems [14]. When the 2D TM mode is discretized, the FDTD formulas are

$$H^{n+1/2}_{x,i,j+1/2} = D_a H^{n-1/2}_{x,i,j+1/2} + D_b \left(E^{n}_{z,i,j} - E^{n}_{z,i,j+1}\right) \qquad (5.4a)$$

$$H^{n+1/2}_{y,i+1/2,j} = D_a H^{n-1/2}_{y,i+1/2,j} + D_b \left(E^{n}_{z,i+1,j} - E^{n}_{z,i,j}\right) \qquad (5.4b)$$

$$E^{n+1}_{z,i,j} = C_a E^{n}_{z,i,j} + C_b \left(H^{n+1/2}_{y,i+1/2,j} - H^{n+1/2}_{y,i-1/2,j} + H^{n+1/2}_{x,i,j-1/2} - H^{n+1/2}_{x,i,j+1/2}\right) \qquad (5.4c)$$

with the coefficients

$$C_a = \frac{2\varepsilon - \sigma\,\Delta t}{2\varepsilon + \sigma\,\Delta t} \qquad (5.5a)$$

$$C_b = \frac{2\,\Delta t}{\Delta x\,(2\varepsilon + \sigma\,\Delta t)} \qquad (5.5b)$$

$$D_a = \frac{2\mu - \rho'\,\Delta t}{2\mu + \rho'\,\Delta t} \qquad (5.5c)$$

$$D_b = \frac{2\,\Delta t}{\Delta x\,(2\mu + \rho'\,\Delta t)} \qquad (5.5d)$$

where n denotes the discrete time index. A programming language has no array location such as n + 1/2, so the half-integer indices can be rounded up to the next integer [17], as follows:

$$H^{n+1}_{x,i,j+1} = D_a H^{n}_{x,i,j+1} + D_b \left(E^{n}_{z,i,j} - E^{n}_{z,i,j+1}\right) \qquad (5.6a)$$

$$H^{n+1}_{y,i+1,j} = D_a H^{n}_{y,i+1,j} + D_b \left(E^{n}_{z,i+1,j} - E^{n}_{z,i,j}\right) \qquad (5.6b)$$

$$E^{n+1}_{z,i,j} = C_a E^{n}_{z,i,j} + C_b \left(H^{n+1}_{y,i+1,j} - H^{n+1}_{y,i,j} + H^{n+1}_{x,i,j} - H^{n+1}_{x,i,j+1}\right) \qquad (5.6c)$$
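The integer-indexed TM update above maps directly onto array operations. The following Python/NumPy sketch is one possible way to write it, not the authors' Matlab program: the function names, the grid size, and the soft point source are assumptions made for illustration. It treats a homogeneous lossless medium and applies no absorbing boundary, so the outermost field values are simply left at zero.

```python
import numpy as np

EPS0 = 8.8541878128e-12   # permittivity of free space, F/m
MU0 = 4e-7 * np.pi        # permeability of free space, H/m


def tmz_coefficients(eps, sigma, mu, rho_m, dt, dx):
    """Update coefficients of Eqs. 5.5a-d (scalars, i.e. a homogeneous medium)."""
    ca = (2 * eps - sigma * dt) / (2 * eps + sigma * dt)
    cb = 2 * dt / (dx * (2 * eps + sigma * dt))
    da = (2 * mu - rho_m * dt) / (2 * mu + rho_m * dt)
    db = 2 * dt / (dx * (2 * mu + rho_m * dt))
    return ca, cb, da, db


def tmz_step(ez, hx, hy, ca, cb, da, db):
    """Advance the fields in place by one time step using Eqs. 5.6a-c."""
    # Eq. 5.6a: Hx[i, j+1] is driven by Ez[i, j] - Ez[i, j+1].
    hx[:, 1:] = da * hx[:, 1:] + db * (ez[:, :-1] - ez[:, 1:])
    # Eq. 5.6b: Hy[i+1, j] is driven by Ez[i+1, j] - Ez[i, j].
    hy[1:, :] = da * hy[1:, :] + db * (ez[1:, :] - ez[:-1, :])
    # Eq. 5.6c: Ez[i, j] uses the freshly updated H values around it.
    ez[:-1, :-1] = ca * ez[:-1, :-1] + cb * (
        hy[1:, :-1] - hy[:-1, :-1] + hx[:-1, :-1] - hx[:-1, 1:]
    )


def run_point_source(n_cells=41, n_steps=5, dx=1e-3, dt=2e-12, freq=900e6):
    """Drive a sinusoidal Ez source at the grid centre in free space."""
    ca, cb, da, db = tmz_coefficients(EPS0, 0.0, MU0, 0.0, dt, dx)
    ez = np.zeros((n_cells, n_cells))
    hx = np.zeros_like(ez)
    hy = np.zeros_like(ez)
    c = n_cells // 2
    for n in range(n_steps):
        tmz_step(ez, hx, hy, ca, cb, da, db)
        ez[c, c] += np.sin(2 * np.pi * freq * (n + 1) * dt)  # soft source
    return ez


ez = run_point_source()
print("peak |Ez| after 5 steps:", np.abs(ez).max())
```

Updating H before E in each step mirrors the order of Eqs. 5.6a-c, where the E update uses the H values at step n + 1. A lossy region or a shield would need per-cell coefficient arrays sliced to match the updated sub-grids, which this sketch does not attempt.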

5.2.1 Perfectly Matched Layer ABC

The perfectly matched layer (PML) ABC is an absorbing material boundary condition first proposed by J. P. Berenger. The PML has proven very effective: it is reflectionless for impinging waves of any polarization and angle of incidence, and over a broad frequency band. In the Berenger PML technique, the computational area is surrounded by PML regions. The EM energy is absorbed rapidly in these layers, so that a perfect conductor can be placed at the outermost boundary. This can also be understood as the interior area being matched to the desired properties by the PML (Fig. 5.3). For a TMz wave, Ez is split into Ezx and Ezy, and Faraday’s law and Ampere’s law break into four equations:

$$\varepsilon \frac{\partial E_{zx}}{\partial t} + \sigma_x E_{zx} = \frac{\partial H_y}{\partial x} \qquad (5.7a)$$

$$\varepsilon \frac{\partial E_{zy}}{\partial t} + \sigma_y E_{zy} = -\frac{\partial H_x}{\partial y} \qquad (5.7b)$$

$$\mu \frac{\partial H_x}{\partial t} + \sigma_y^{*} H_x = -\frac{\partial (E_{zx} + E_{zy})}{\partial y} \qquad (5.7c)$$

$$\mu \frac{\partial H_y}{\partial t} + \sigma_x^{*} H_y = \frac{\partial (E_{zx} + E_{zy})}{\partial x} \qquad (5.7d)$$

where $\sigma^{*}$ is the equivalent magnetic conductivity.

Fig. 5.3 The PML technique (a vacuum region containing the wave source, surrounded by PML regions labelled with their conductivity sets (σx, σ*x, σy, σ*y) and terminated by a perfect conductor)

In the PML area, the finite difference equations are [18]:

$$H_x^{n+1}(i+1/2, j) = e^{-\sigma_y^{*}(i+1/2,\,j)\,\delta t/\mu}\, H_x^{n}(i+1/2, j) - \frac{1 - e^{-\sigma_y^{*}(i+1/2,\,j)\,\delta t/\mu}}{\sigma_y^{*}(i+1/2, j)\,\delta} \left[ E_{zx}^{n+1/2}(i+1/2, j+1/2) + E_{zy}^{n+1/2}(i+1/2, j+1/2) - E_{zx}^{n+1/2}(i+1/2, j-1/2) - E_{zy}^{n+1/2}(i+1/2, j-1/2) \right] \qquad (5.8a)$$

$$H_y^{n+1}(i, j+1/2) = e^{-\sigma_x^{*}(i,\,j+1/2)\,\delta t/\mu}\, H_y^{n}(i, j+1/2) - \frac{1 - e^{-\sigma_x^{*}(i,\,j+1/2)\,\delta t/\mu}}{\sigma_x^{*}(i, j+1/2)\,\delta} \left[ E_{zx}^{n+1/2}(i-1/2, j+1/2) + E_{zy}^{n+1/2}(i-1/2, j+1/2) - E_{zx}^{n+1/2}(i+1/2, j+1/2) - E_{zy}^{n+1/2}(i+1/2, j+1/2) \right] \qquad (5.8b)$$

$$E_{zx}^{n+1/2}(i+1/2, j+1/2) = e^{-\sigma_x(i+1/2,\,j+1/2)\,\delta t/\varepsilon}\, E_{zx}^{n-1/2}(i+1/2, j+1/2) - \frac{1 - e^{-\sigma_x(i+1/2,\,j+1/2)\,\delta t/\varepsilon}}{\sigma_x(i+1/2, j+1/2)\,\delta} \left[ H_y^{n}(i, j+1/2) - H_y^{n}(i+1, j+1/2) \right] \qquad (5.8c)$$

$$E_{zy}^{n+1/2}(i+1/2, j+1/2) = e^{-\sigma_y(i+1/2,\,j+1/2)\,\delta t/\varepsilon}\, E_{zy}^{n-1/2}(i+1/2, j+1/2) - \frac{1 - e^{-\sigma_y(i+1/2,\,j+1/2)\,\delta t/\varepsilon}}{\sigma_y(i+1/2, j+1/2)\,\delta} \left[ H_x^{n}(i+1/2, j+1) - H_x^{n}(i+1/2, j) \right] \qquad (5.8d)$$

where $\delta t$ and $\delta$ denote the time step and the cell size. In the PML, the magnetic and electric conductivities are matched so that there is no reflection between layers. The wave impedance matching condition is [19]

$$\frac{\sigma}{\varepsilon} = \frac{\sigma^{*}}{\mu} \qquad (5.9)$$
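The matching condition can be illustrated numerically. The short Python sketch below is an assumption-laden example: the function names and the layer conductivity are hypothetical, and free-space permittivity and permeability are assumed for the layer. It computes the magnetic conductivity that satisfies the matching condition and the exponential decay factors that multiply the previous field values in the PML update equations.

```python
import math

EPS0 = 8.8541878128e-12   # permittivity of free space, F/m
MU0 = 4e-7 * math.pi      # permeability of free space, H/m


def matched_magnetic_conductivity(sigma, eps=EPS0, mu=MU0):
    """Return sigma* satisfying Eq. 5.9: sigma / eps = sigma* / mu."""
    return sigma * mu / eps


def pml_decay_factors(sigma, dt, eps=EPS0, mu=MU0):
    """Exponential factors applied to the old E and H values in Eqs. 5.8."""
    sigma_star = matched_magnetic_conductivity(sigma, eps, mu)
    return math.exp(-sigma * dt / eps), math.exp(-sigma_star * dt / mu)


# Hypothetical layer conductivity (S/m) and a 2 ps time step.
e_factor, h_factor = pml_decay_factors(sigma=0.5, dt=2e-12)
print(e_factor, h_factor)
```

When the condition holds, the electric and magnetic fields are attenuated by the same factor per time step, which is what keeps the wave impedance of the layer equal to that of the interior.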

5.2.2 Mur’s Absorbing Boundary Conditions

Spurious wave reflections occur at the boundaries of the computational domain because the FDTD grid is truncated. Virtual absorbing boundaries must be used to prevent these reflections. Many absorbing boundary conditions (ABCs) have been developed over the past several decades, and Mur’s ABC is one of the most common. There are two types of Mur’s ABC for estimating the fields on the boundary: first-order and second-order accurate. The second-order form provides better absorption with fewer cells required between the object and the outer boundary, but at the expense of added complexity. The first-order Mur absorbing boundaries are adequate and relatively simple to apply [13]. The FDTD simulations here have been carried out in two dimensions with first-order Mur absorbing boundary conditions and therefore did not require a supercomputer. Considering the Ez component located at x = iΔx, y = jΔy in the 2D case, the first-order Mur estimate of the Ez field component on the boundary is [17, 20]

$$E^{n+1}_{i,j} = E^{n}_{i-1,j} + \frac{c\,\Delta t - \Delta x}{c\,\Delta t + \Delta x}\left(E^{n+1}_{i-1,j} - E^{n}_{i,j}\right) \qquad (5.10)$$
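As a minimal sketch of how the first-order Mur estimate could be coded, the Python functions below evaluate the boundary value for one node. The function and argument names are hypothetical; the 2 ps time step and 1 mm cell size used in the example are the increments reported for the simulations in this chapter.

```python
C0 = 299_792_458.0  # speed of light in free space, m/s


def mur_coefficient(dt, dx, c=C0):
    """Coefficient (c*dt - dx) / (c*dt + dx) of Eq. 5.10."""
    return (c * dt - dx) / (c * dt + dx)


def mur_boundary_value(e_old_inner, e_new_inner, e_old_boundary, dt, dx, c=C0):
    """First-order Mur estimate of the boundary field at time step n+1.

    e_old_inner    : E at the neighbouring interior node, step n
    e_new_inner    : E at the neighbouring interior node, step n+1
    e_old_boundary : E at the boundary node, step n
    """
    k = mur_coefficient(dt, dx, c)
    return e_old_inner + k * (e_new_inner - e_old_boundary)


k = mur_coefficient(dt=2e-12, dx=1e-3)
print(f"Mur coefficient for dt = 2 ps, dx = 1 mm: {k:.4f}")
```

When c Δt equals Δx the coefficient vanishes and the boundary node simply takes the previous value of its interior neighbour, which is the exact outgoing-wave behaviour in one dimension.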

5.3 Developed Program

A program was developed in the Matlab programming language to examine the propagation of mobile phone radiation [21, 22]. The area analyzed in the program is represented in Fig. 5.4, and the flow chart of the program is shown in Fig. 5.5. As shown in Fig. 5.5, the required input parameters of the program are first entered by the user, the area of analysis is divided into cells, and matrices are created for the electric and magnetic field components (E, H) calculated at each

Fig. 5.4 Representation of the 2D simulation area (x-axis marks: 0, x0, x1, x2, xT, xH, xend; y-axis marks: y1, y0, yT, y2, yend)

Fig. 5.5 Flow diagram for developed program

time step and each cell. Then, the mathematical function of the electric field emitted by the mobile phone antenna is entered. Mur’s absorbing boundary conditions are applied to eliminate artificial reflections, and loops calculate the electric and magnetic field values by stepping through position and time in the part of the program that can be called the FDTD cycle.


Fig. 5.6 Graphical interface of the developed program

The maximum electric field value is recorded at the test point (T) for 1 g or 10 g SAR. SAR values are calculated for each cell using Eq. 5.11 [23], and the 1 g or 10 g averaged SAR is then obtained by averaging them.

$$\mathrm{SAR} = \frac{\sigma\,|E_T|^2}{2\rho} \quad (\mathrm{W/kg}) \qquad (5.11)$$

Here, $\sigma$ is the average conductivity of the head, $\rho$ is the average mass density, and $E_T$ is the maximum electric field calculated at the test point. A graphical interface, shown in Fig. 5.6, has been designed for the developed program. All required data are entered there, and the program is then executed with the START button.
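The SAR formula and the averaging step can be sketched in a few lines of Python. This is an illustrative example rather than the authors' program: the function names and the field magnitudes are hypothetical, while the conductivity and mass density are the head values quoted in this chapter.

```python
def sar_from_peak_field(e_peak, sigma, rho):
    """Local SAR in W/kg from Eq. 5.11: sigma * |E_T|^2 / (2 * rho)."""
    if rho <= 0:
        raise ValueError("mass density must be positive")
    return sigma * abs(e_peak) ** 2 / (2.0 * rho)


def averaged_sar(e_peaks, sigma, rho):
    """Mean of the per-cell SAR values over the averaging cells."""
    values = [sar_from_peak_field(e, sigma, rho) for e in e_peaks]
    return sum(values) / len(values)


SIGMA_HEAD = 0.97   # S/m, average head conductivity quoted in the chapter
RHO_HEAD = 1000.0   # kg/m^3, average mass density quoted in the chapter

# Hypothetical peak field magnitudes (V/m) in a few neighbouring cells.
fields = [10.0, 9.0, 8.0, 7.0]
print(averaged_sar(fields, SIGMA_HEAD, RHO_HEAD))
```

Because SAR scales with the square of the field magnitude, the cells closest to the source dominate the average.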

5.3.1 Input Parameters

Simulations were performed separately for the unshielded case, by entering the electrical properties of free space in the Shield Features part of the program’s graphical user interface, and for the shielded case, by entering the electrical properties of copper ($\sigma = 5.8 \times 10^{7}$ S/m, shield size 30 × 2 mm). SAR was calculated at 8 cells

Table 5.1 Obtained SAR values from simulation results and shielding effectiveness values

Averaging mass   SAR without shield (W/kg)   SAR with copper shield (W/kg)   SE (dB)
1 g              0.7079                      0.0061                          -41.3
10 g             0.5958                      0.0060                          -39.9

for 1 g in the vicinity of the test point, and at 80 cells for 10 g SAR. The output power of the radiation source was assumed constant during the simulations. For the head region in which SAR values were calculated, the average electrical conductivity was taken as 0.97 S/m, the average mass density as 1000 kg/m^3, the relative permittivity as 41.5, and the diameter as 180 mm at 900 MHz [24, 25]. The time and space increments of the FDTD simulations were selected as 2 ps and 1 mm, respectively.
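The chosen increments can be checked against the Courant condition and the cells-per-wavelength guideline of Sect. 5.2. The Python sketch below is a rough check under stated assumptions: the helper names are hypothetical, free-space light speed is used as the maximum velocity, and the wavelength in the head is estimated from the relative permittivity alone, ignoring conductivity.

```python
import math

C0 = 299_792_458.0  # speed of light in free space, m/s


def courant_limit(v, *cell_sides):
    """Largest stable time step for the given cell sides (Eqs. 5.1a-c)."""
    return 1.0 / (v * math.sqrt(sum(1.0 / s ** 2 for s in cell_sides)))


def wavelength_in_medium(freq, eps_r):
    """Wavelength in a lossless dielectric; conductivity is ignored here."""
    return C0 / (freq * math.sqrt(eps_r))


dx = 1e-3          # m, space increment from the chapter
dt = 2e-12         # s, time increment from the chapter
dt_max = courant_limit(C0, dx, dx)            # 2D grid with equal sides
lam_head = wavelength_in_medium(900e6, 41.5)  # head permittivity from the chapter

print(f"Courant limit: {dt_max * 1e12:.2f} ps, chosen step: {dt * 1e12:.2f} ps")
print(f"cells per wavelength in the head: {lam_head / dx:.1f}")
```

Under these assumptions the 2 ps step lies below the 2D Courant limit for 1 mm cells, and the grid resolves the wavelength in the head with well over ten cells.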

5.3.2 Simulation Results

The 1 g and 10 g averaged SAR values were calculated for both cases and are given in Table 5.1. The shielding effectiveness (SE) was calculated from the simulated values with Eq. 5.12.

$$\mathrm{SE} = 20 \log \frac{S_1}{S_2} \quad (\mathrm{dB}) \qquad (5.12)$$

Here, $S_1$ is the SAR value in the shielded case and $S_2$ is the SAR value in the unshielded case. As shown in Table 5.1, under the effect of the copper shield the SAR value decreased from 0.7079 to 0.0061 W/kg for the 1 g averaged case and from 0.5958 to 0.0060 W/kg for the 10 g averaged case.
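The shielding effectiveness values can be recomputed from the tabulated SAR values. The Python sketch below applies the SE definition given above, assuming a base-10 logarithm; the function name is hypothetical and the inputs are the four SAR values from Table 5.1.

```python
import math


def shielding_effectiveness_db(sar_shielded, sar_unshielded):
    """SE in dB as defined in Eq. 5.12: 20 * log10(S1 / S2)."""
    return 20.0 * math.log10(sar_shielded / sar_unshielded)


# SAR values (W/kg) from Table 5.1: (with copper shield, without shield).
table = {"1 g": (0.0061, 0.7079), "10 g": (0.0060, 0.5958)}
for mass, (s1, s2) in table.items():
    print(f"{mass}: SE = {shielding_effectiveness_db(s1, s2):.1f} dB")
```

Rounded to one decimal place, the results match the SE column of Table 5.1.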

5.4 Conclusion

This chapter has described mobile phones and their possible health risks, the SAR parameter and its calculation and experimental measurement, and a numerical computing technique (the 2D-FDTD method). In the application given in this chapter, the reduction of radiation from a mobile phone towards the user by means of a copper shield at 900 MHz was investigated by calculating SAR values in simulations. The 2D formulation and Mur’s boundary condition were chosen to keep computer memory and processor requirements to a minimum. The 1 g and 10 g averaged SAR values were computed separately, and the shielding effectiveness was calculated from the SAR values estimated for the shielded and unshielded conditions. The simulations showed that the SAR values affecting the mobile phone user were reduced by about 40 dB when the copper shield was used.

Acknowledgments This work was supported by the scientific research projects (BAP) coordinating office of Selçuk University.


References

1. Health Protection Agency [Online] Available: http://www.hpa.org.uk/Topics/Radiation/UnderstandingRadiation/UnderstandingRadiationTopics/ElectromagneticFields/MobilePhones/info_HealthAdvice/
2. Australian Radiation Protection and Nuclear Safety Agency [Online] Available: http://www.arpansa.gov.au/mobilephones/index.cfm
3. World Health Organization [Online] Available: http://www.who.int/mediacentre/factsheets/fs193/en/index.html
4. Ahlbom A, Bergqvist U, Bernhardt JH, Césarini JP, Court LA (1998) Guidelines for limiting exposure to time-varying electric, magnetic, and electromagnetic fields (up to 300 GHz). Health Phys Soc 74(4):494–522
5. Kuo L-C, Chuang H-R (2003) FDTD computation of fat layer effects on the SAR distribution in a multilayered superquadric-ellipsoidal head-model irradiated by a dipole antenna at 900/1800 MHz. In: IEEE International Symposium on Electromagnetic Compatibility
6. Kuo L-C, Lin C-C, Chuang H-R (2004) FDTD computation of fat layer effects on SAR distribution in a multilayered superquadric-ellipsoidal head model and MRI-based heads proximate to a dipole antenna. Radio Science Conference, Proceedings. Asia-Pacific, August 2004
7. Chen H-Y, Wang H-H (1994) Current and SAR induced in a human head model by the electromagnetic fields irradiated from a cellular phone. IEEE Trans Microwave Theory Tech 42(12):2249–2254
8. Schiavoni A, Bertotto P, Richiardi G, Bielli P (2000) SAR generated by commercial cellular phones-phone modeling, head modeling, and measurements. IEEE Trans Microwave Theory Tech 48(11):2064–2071
9. Kusuma AH, Sheta A-F, Elshafiey I, Alkanhal M, Aldosari S, Alshebeili SA (2010) Low SAR antenna design for modern wireless mobile terminals. In: STS International Conference, January 2010
10. Islam MT, Faruque MRI, Misran N (2009) Reduction of specific absorption rate (SAR) in the human head with ferrite material and metamaterial. Prog Electromagn Res C 9:47–58
11. Yee K (1966) Numerical solution of initial boundary value problems involving Maxwell’s equations in isotropic media. IEEE Trans Antennas Propag 14:302–307
12. Taflove A, Brodwin ME (1975) Numerical solution of steady-state electromagnetic scattering problems using the time-dependent Maxwell’s equations. IEEE Trans Microwave Theory Tech 23:623–630
13. Kunz KS, Luebbers RJ (1993) The finite difference time domain method for electromagnetism. CRC Press, Boca Raton
14. Stutzman WL, Thiele GA (1998) Antenna theory. Wiley, New York
15. Isaacson E, Keller HB (1967) Analysis of numerical methods. Wiley, New York
16. Davidson DB (2005) Computational electromagnetics for RF and microwave engineering. Cambridge University Press, Cambridge
17. Seyfi L, Yaldiz E (2006) Shielding analysis of mobile phone radiation with good conductors. In: Proceedings of the International Conference on Modeling and Simulation, vol 1, pp 189–194
18. Sadiku MNO (2000) Numerical techniques in electromagnetics, 2nd edn. CRC Press, Boca Raton
19. Berenger JP (1994) A perfectly matched layer for the absorption of electromagnetic waves. J Comput Phys 114:185–200
20. Mur G (1981) Absorbing boundary conditions for the finite-difference approximation of the time-domain electromagnetic field equations. IEEE Trans Electromag Compat 23:377–382
21. Seyfi L, Yaldız E (2010) Numerical computing of reduction of SAR values in a homogenous head model using copper shield. Lecture notes in engineering and computer science: Proceedings of the World Congress on Engineering 2010, WCE 2010, 30 June–2 July 2010, London, UK, pp 839–843
22. Seyfi L, Yaldız E (2008) Simulation of reductions in radiation from cellular phones towards their users. In: First International Conference on Simulation Tools and Techniques for Communications, Networks and Systems, Marseille, France, March 2008
23. Foster KR, Chang K (eds) (2005) Encyclopedia of RF and microwave engineering, vol 1. Wiley-Interscience, Hoboken
24. Moustafa J, Abd-Alhameed RA, Vaul JA, Excell PS, McEwen NJ (2001) Investigations of reduced SAR personal communications handset using FDTD. In: Eleventh International Conference on Antennas and Propagation (IEE Conf Publ No 480), 17–20 April, pp 11–15
25. FCC, OET Bulletin 65c (2001) Evaluating compliance with FCC Guidelines for human exposure to radio frequency electromagnetic fields, [Online] Available: http://www.fcc.gov/Bureaus/Engineering_Technology/Documents/bulletins/oet65/oet65c.pdf

Chapter 6

Mitigation of Magnetic Field Under Overhead Transmission Line Adel Zein El Dein Mohammed Moussa

Abstract The chapter presents an efficient way to mitigate the magnetic field produced by the three-phase 500 kV single-circuit overhead transmission line existing in Egypt by using a passive loop conductor. The aim is to reduce the amount of land required as right-of-way. The chapter uses an accurate method for evaluating the 50 Hz magnetic field produced by overhead transmission lines, based on the matrix formalism of multiconductor transmission lines (MTL). This method gives a correct evaluation of all the currents flowing in the MTL structure, including the currents in the subconductors of each phase bundle, the currents in the ground wires, the currents in the mitigation loop, and the earth return currents. Furthermore, the analysis incorporates the effect of the conductor sag between towers and the effect of sag variation with temperature on the calculated magnetic field. Good results have been obtained, and passive loop conductor design parameters are recommended for this system at ambient temperature (35°C).

6.1 Introduction

The rapid increase in HV transmission lines and in irregular populated areas near man-made sources of electric and magnetic fields in Egypt calls for methods to minimize or eliminate the effect of magnetic and electric fields on human beings in the Egyptian environment, especially in irregular areas.

A. Z. El Dein Mohammed Moussa (&) Department of Electrical Engineering, High Institute of Energy, South Valley University, Aswan, 81258, Egypt e-mail: [email protected]

S. I. Ao and L. Gelman (eds.), Electrical Engineering and Applied Computing, Lecture Notes in Electrical Engineering, 90, DOI: 10.1007/978-94-007-1192-1_6, © Springer Science+Business Media B.V. 2011


Public concern about the effects of magnetic fields on human safety has triggered a wealth of research focused on the evaluation of magnetic fields produced by power lines [1–4]. Studies include the design of new compact transmission line configurations; the inclusion of auxiliary single or double loops for magnetic field mitigation in existing power lines; the consideration of series-capacitor compensation schemes for enhancing magnetic field mitigation; and the reconfiguration of lines to high phase operation [5–7]. However, many of the published studies of power lines make use of simplifying assumptions that inevitably give rise to inaccurate results in the computed magnetic fields. Common simplifications include neglecting the earth currents, neglecting the ground wires, replacing bundled phase conductors with equivalent single conductors, and replacing actual sagged conductors with horizontal conductors at an average height. These assumptions result in a model whose magnetic fields are distorted from those produced in reality [8, 9]. In this chapter, a matrix-based MTL model [10] is used in which the effects of earth currents, ground wire currents, and the mitigation loop current are taken into account; moreover, actual bundle conductors and the conductors’ sag at various temperatures are taken into consideration. The results from this method without the mitigation loop are compared with those produced by the common-practice method [8, 9] for magnetic field calculation, in which the power transmission lines are straight horizontal wires of infinite length, parallel to a flat ground and parallel to each other. The optimal design parameters of the mitigation loop for the system under study are then obtained.

6.2 Computation of System Currents

The MTL technique is used in this chapter solely to derive the relationship among the line currents of an overhead power line. The method is explained in [10]; this chapter reviews it and extends it to the Egyptian 500 kV overhead transmission line, using a different formula for the conductors' sag that takes into account the effect of temperature on the sag configuration [11]. The first step of a correct analysis is to determine all system currents from the prescribed phase-conductor currents $I_p$:

$$I_p = [I_1, I_2, I_3] \tag{6.1}$$

Consider the frequency-domain transmission line matrix equations for nonuniform MTLs (allowing the inclusion of the sag effect):

$$-\frac{dV}{dz} = Z'(\omega, z)\, I \tag{6.2a}$$

$$-\frac{dI}{dz} = Y'(\omega, z)\, V \tag{6.2b}$$

where $Z'$ and $Y'$ denote the per-unit-length series-impedance and shunt-admittance matrices, respectively, and $V$ and $I$ are complex column matrices collecting the phasors of the voltages and currents of all line conductors:

$$V = \begin{bmatrix} [V_a]_{1\times n_p} \\ [V_b]_{1\times n_p} \\ [V_c]_{1\times n_p} \\ [V_G]_{1\times n_G} \\ [V_L]_{1\times n_L} \end{bmatrix} \quad \text{and} \quad I = \begin{bmatrix} [I_a]_{1\times n_p} \\ [I_b]_{1\times n_p} \\ [I_c]_{1\times n_p} \\ [I_G]_{1\times n_G} \\ [I_L]_{1\times n_L} \end{bmatrix} \tag{6.3}$$

In (6.3), subscripts a, b, and c refer to the partition of the phase bundles into three sub-conductor sets, subscript G refers to the ground wires, and subscript L refers to the mitigation loop. The quantities $n_p$, $n_G$, and $n_L$ denote the number of phase bundles, the number of ground wires, and the number of conductors in the mitigation loop, respectively; for the Egyptian 500 kV overhead transmission line considered in this chapter, $n_p = 3$, $n_G = 2$, and $n_L = 2$. Since separating the electric and magnetic effects is adequate for quasi-stationary regimes (50 Hz), where wave-propagation phenomena are negligible, all system currents are assumed to be independent of z. This means that the transversal displacement currents among conductors are negligible or, in other words, that the right-hand side of (6.2b) is zero and only $Z'$ needs to be computed. Since the standard procedure for computing $Z'$ in (6.2a) is established elsewhere [12–14], only a brief summary is presented here:

$$Z' = j\omega L + Z_E + Z_{skin} \tag{6.4}$$

The external-inductance matrix $L$ is a frequency-independent real symmetric matrix whose entries are:

$$L_{kk} = \frac{\mu_0}{2\pi} \ln\frac{2 y_k}{r_k} \tag{6.5a}$$

$$L_{ik} = \frac{\mu_0}{4\pi} \ln\frac{(y_i + y_k)^2 + (x_i - x_k)^2}{(y_i - y_k)^2 + (x_i - x_k)^2} \tag{6.5b}$$

where $r_k$ denotes the conductor radius, and $y_k$ and $x_k$ denote the vertical and horizontal coordinates of conductor k. Matrix $Z_E$, the earth impedance correction, is a frequency-dependent complex matrix whose entries can be determined using Carson's theory or, alternatively, the Dubanton complex ground plane approach [12–14]. The entries of $Z_E$ are defined as:

$$(Z_E)_{kk} = j\omega \frac{\mu_0}{2\pi} \ln\left(1 + \frac{\bar{P}}{y_k}\right) \tag{6.6a}$$

$$(Z_E)_{ik} = j\omega \frac{\mu_0}{4\pi} \ln\left(\frac{(y_i + y_k + 2\bar{P})^2 + (x_i - x_k)^2}{(y_i + y_k)^2 + (x_i - x_k)^2}\right) \tag{6.6b}$$

where $\bar{P}$, the complex depth, is given by $\bar{P} = (j\omega\mu_0/\rho)^{-1/2}$, with $\rho$ denoting the earth resistivity. Matrix $Z_{skin}$ is a frequency-dependent complex diagonal matrix whose entries can be determined from the skin-effect theory results for cylindrical conductors [9]. For low-frequency situations:

$$(Z_{skin})_{kk} = (R_{dc})_k + j\omega \frac{\mu_0}{8\pi} \tag{6.7}$$

Fig. 6.1 Linear dimensions which determine parameters of the catenary
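The assembly of $Z'$ from (6.4)–(6.7) can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `series_impedance`, the tuple layout of the conductor data and the default earth resistivity of 100 Ω·m are assumptions made here, not values or code from the chapter. Conductor heights are treated as fixed, so the sag dependence of (6.8) is not included.

```python
import cmath
import math

MU0 = 4e-7 * math.pi  # permeability of free space (H/m)

def series_impedance(conductors, freq=50.0, rho=100.0):
    """Per-unit-length Z' = jwL + Z_E + Z_skin following Eqs. (6.4)-(6.7).

    conductors: list of (x, y, radius, r_dc) in m, m, m and ohm/m.
    rho: earth resistivity in ohm*m (assumed value, not given in the chapter).
    """
    w = 2.0 * math.pi * freq
    p = 1.0 / cmath.sqrt(1j * w * MU0 / rho)  # complex depth
    n = len(conductors)
    z = [[0j] * n for _ in range(n)]
    for i, (xi, yi, ri, rdc) in enumerate(conductors):
        for k, (xk, yk, _, _) in enumerate(conductors):
            if i == k:
                l_ext = MU0 / (2 * math.pi) * math.log(2 * yi / ri)            # (6.5a)
                z_e = 1j * w * MU0 / (2 * math.pi) * cmath.log(1 + p / yi)     # (6.6a)
                z_skin = rdc + 1j * w * MU0 / (8 * math.pi)                    # (6.7)
                z[i][k] = 1j * w * l_ext + z_e + z_skin
            else:
                dx2 = (xi - xk) ** 2
                l_ext = MU0 / (4 * math.pi) * math.log(
                    ((yi + yk) ** 2 + dx2) / ((yi - yk) ** 2 + dx2))           # (6.5b)
                z_e = 1j * w * MU0 / (4 * math.pi) * cmath.log(
                    ((yi + yk + 2 * p) ** 2 + dx2) / ((yi + yk) ** 2 + dx2))   # (6.6b)
                z[i][k] = 1j * w * l_ext + z_e
    return z

# Example: the two ground wires of Table 6.2 (R_dc converted from ohm/km to ohm/m).
ground_wires = [(-8.0, 30.0, 5.6e-3, 0.564e-3), (8.0, 30.0, 5.6e-3, 0.564e-3)]
z_ground = series_impedance(ground_wires)
```

The matrix returned is symmetric, and the real part of each diagonal entry exceeds the dc resistance because the earth correction adds a resistive term.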

where $(R_{dc})_k$ denotes the per-unit-length dc resistance of conductor k. Because of the conductors' sag between towers, $y_k$ is a function of the distance z along the span, and the entries of $L$ and $Z_E$, defined in (6.5a, b) and (6.6a, b), therefore vary along the longitudinal coordinate z. The exact shape of a conductor suspended between two towers of equal height can be described by parameters such as the distance between the points of suspension (the span), d; the sag of the conductor, S; the height of the lowest point above the ground, h; and the height of the highest point above the ground, $h_m$. These parameters can be used in different combinations [13, 14]. Figure 6.1 depicts the basic catenary geometry for a single-conductor line, which is described by:

$$y_k = h_k + 2 a_k \sinh^2\left(\frac{z}{2 a_k}\right) \tag{6.8}$$

where $a_k$ is the solution of the transcendental equation $2[(h_{mk} - h_k)/d_k]\,u_k = \sinh^2(u_k)$ for conductor k, with $u_k = d_k/(4a_k)$. The parameter $a_k$ is also related to the mechanical parameters of the line: $a_k = (T_h)_k / w_k$, where $(T_h)_k$ is the conductor tension at mid-span and $w_k$ is the weight per unit length of conductor k. Consider a mitigation loop of length l, where l is a multiple of the span length d. The line section under analysis has its near end at z = -l/2 and its far end at z = l/2. Integrating (6.2a) from z = -l/2 to z = l/2 gives:

$$V_{near} - V_{far} = \left(\int_{-l/2}^{l/2} Z'(z)\, dz\right) I \tag{6.9a}$$
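The transcendental equation for $a_k$ that accompanies (6.8) has no closed-form solution, but it is easy to solve numerically. The sketch below uses plain bisection; the function names and the numerical data (a 400 m span, a suspension height of 22.13 m and a sag of 9.3 m) are illustrative assumptions, since the chapter does not state the span length.

```python
import math

def catenary_parameter(h_low, h_high, span, tol=1e-12):
    """Solve 2*((h_high - h_low)/span)*u = sinh(u)**2 for u > 0, return a = span/(4u)."""
    ratio = (h_high - h_low) / span  # sag-to-span ratio, assumed positive

    def residual(u):
        return math.sinh(u) ** 2 - 2.0 * ratio * u

    lo, hi = 1e-9, 1.0
    while residual(hi) < 0.0:   # widen the bracket until the sign changes
        hi *= 2.0
    while hi - lo > tol:        # bisection on the non-trivial root
        mid = 0.5 * (lo + hi)
        if residual(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return span / (4.0 * 0.5 * (lo + hi))

def conductor_height(z, h_low, a):
    """Conductor height y(z) from Eq. (6.8), with z measured from mid-span."""
    return h_low + 2.0 * a * math.sinh(z / (2.0 * a)) ** 2

# Assumed example data: 400 m span, 22.13 m at the towers, 9.3 m sag.
span, h_high, sag = 400.0, 22.13, 9.3
a_k = catenary_parameter(h_high - sag, h_high, span)
```

With `a_k` known, `conductor_height(0.0, h_high - sag, a_k)` returns the mid-span height and `conductor_height(span / 2, h_high - sag, a_k)` recovers the suspension height.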

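Equation 6.9a can be written explicitly, in partitioned form, as:

$$\begin{bmatrix} \Delta V_a \\ \Delta V_b \\ \Delta V_c \\ \Delta V_G \\ \Delta V_L \end{bmatrix} = \begin{bmatrix} Z_{aa} & Z_{ab} & Z_{ac} & Z_{aG} & Z_{aL} \\ Z_{ba} & Z_{bb} & Z_{bc} & Z_{bG} & Z_{bL} \\ Z_{ca} & Z_{cb} & Z_{cc} & Z_{cG} & Z_{cL} \\ Z_{Ga} & Z_{Gb} & Z_{Gc} & Z_{GG} & Z_{GL} \\ Z_{La} & Z_{Lb} & Z_{Lc} & Z_{LG} & Z_{LL} \end{bmatrix} \begin{bmatrix} I_a \\ I_b \\ I_c \\ I_G \\ I_L \end{bmatrix} \tag{6.9b}$$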

The bus impedance matrix $Z$ in Eq. 6.9a, b is computed from:

$$Z = \int_{-l/2}^{l/2} Z'(z)\, dz \tag{6.10}$$

where the values of $Z'$ are evaluated from Eqs. 6.4–6.7 using the conductor heights given by (6.8). The two-conductor mitigation loop is closed and may or may not include a series capacitor of impedance $Z_c$ [7]. In either case, the submatrix $I_L$ in (6.3) has the form:

$$I_L = \begin{bmatrix} I_{L1} \\ I_{L2} \end{bmatrix} = \bar{I}_L S^T \tag{6.11}$$

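where $S = [\,1 \;\; -1\,]$. Using the boundary conditions at the near and far ends of the line section, the voltage drop in the mitigation loop is:

$$\Delta V_L = \begin{bmatrix} \Delta V_{L1} \\ \Delta V_{L2} \end{bmatrix} = \begin{bmatrix} V_{L1} \\ V_{L2} \end{bmatrix}_{near} - \begin{bmatrix} V_{L1} \\ V_{L2} \end{bmatrix}_{far} = \begin{bmatrix} \Delta V_{L1} \\ \Delta V_{L1} + Z_c \bar{I}_L \end{bmatrix}$$

which can be written as:

$$S\,\Delta V_L = \Delta V_{L1} - \Delta V_{L2} = -Z_c \bar{I}_L \tag{6.12}$$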

where $\bar{I}_L$ is the loop current and $Z_c = jX_s = 1/(j\omega C_s)$ is the impedance of the series capacitor included in the loop. Using (6.12), the fifth equation contained in (6.9b) allows the currents flowing in the mitigation loop to be evaluated:

$$I_L = \begin{bmatrix} \bar{I}_L \\ -\bar{I}_L \end{bmatrix} = -\underbrace{Y S^T S Z_{La}}_{K_{La}} I_a - \underbrace{Y S^T S Z_{Lb}}_{K_{Lb}} I_b - \underbrace{Y S^T S Z_{Lc}}_{K_{Lc}} I_c - \underbrace{Y S^T S Z_{LG}}_{K_{LG}} I_G \tag{6.13}$$

where $Y = (Z_c + S Z_{LL} S^T)^{-1}$. Since the conductors belonging to a given phase bundle are bonded to each other, and the ground wires are bonded to earth (tower resistances neglected), $\Delta V_a = \Delta V_b = \Delta V_c$ and $\Delta V_G = 0$. Using $\Delta V_G = 0$ in the fourth equation contained in (6.9b), together with Eq. 6.13, the ground-wire currents are:

$$I_G = \underbrace{Y_G (Z_{Ga} - Z_{GL} K_{La})}_{K_{Ga}} I_a + \underbrace{Y_G (Z_{Gb} - Z_{GL} K_{Lb})}_{K_{Gb}} I_b + \underbrace{Y_G (Z_{Gc} - Z_{GL} K_{Lc})}_{K_{Gc}} I_c \tag{6.14}$$

where $Y_G = (Z_{GL} K_{LG} - Z_{GG})^{-1}$. Next, using (6.13) and (6.14), $I_L$ and $I_G$ can be eliminated from (6.9b), yielding a reduced-order matrix problem:

$$\begin{bmatrix} \Delta V_a \\ \Delta V_b \\ \Delta V_c \end{bmatrix} = \begin{bmatrix} \hat{Z}_{aa} & \hat{Z}_{ab} & \hat{Z}_{ac} \\ \hat{Z}_{ba} & \hat{Z}_{bb} & \hat{Z}_{bc} \\ \hat{Z}_{ca} & \hat{Z}_{cb} & \hat{Z}_{cc} \end{bmatrix} \begin{bmatrix} I_a \\ I_b \\ I_c \end{bmatrix} \tag{6.15}$$

where

$$\begin{aligned}
\hat{Z}_{aa} &= Z_{aa} + Z_{aG} K_{Ga} - Z_{aL}(K_{La} + K_{LG} K_{Ga}) \\
\hat{Z}_{ab} &= Z_{ab} + Z_{aG} K_{Gb} - Z_{aL}(K_{Lb} + K_{LG} K_{Gb}) \\
\hat{Z}_{ac} &= Z_{ac} + Z_{aG} K_{Gc} - Z_{aL}(K_{Lc} + K_{LG} K_{Gc}) \\
\hat{Z}_{ba} &= Z_{ba} + Z_{bG} K_{Ga} - Z_{bL}(K_{La} + K_{LG} K_{Ga}) \\
\hat{Z}_{bb} &= Z_{bb} + Z_{bG} K_{Gb} - Z_{bL}(K_{Lb} + K_{LG} K_{Gb}) \\
\hat{Z}_{bc} &= Z_{bc} + Z_{bG} K_{Gc} - Z_{bL}(K_{Lc} + K_{LG} K_{Gc}) \\
\hat{Z}_{ca} &= Z_{ca} + Z_{cG} K_{Ga} - Z_{cL}(K_{La} + K_{LG} K_{Ga}) \\
\hat{Z}_{cb} &= Z_{cb} + Z_{cG} K_{Gb} - Z_{cL}(K_{Lb} + K_{LG} K_{Gb}) \\
\hat{Z}_{cc} &= Z_{cc} + Z_{cG} K_{Gc} - Z_{cL}(K_{Lc} + K_{LG} K_{Gc})
\end{aligned}$$

The relationship between $I_a$, $I_b$ and $I_c$ is obtained from (6.15) by setting $\Delta V_a = \Delta V_b = \Delta V_c$ and using $I_a + I_b + I_c = I_p$. The following relations are then obtained:

$$I_a = KK_{ac}\,(KK_{ac} + K_{bc} + 1)^{-1} I_p \tag{6.16a}$$

$$I_b = K_{bc}\,(KK_{ac} + K_{bc} + 1)^{-1} I_p \tag{6.16b}$$

$$I_c = (KK_{ac} + K_{bc} + 1)^{-1} I_p \tag{6.16c}$$

where $KK_{ac} = K_{ab} K_{bc} + K_{ac}$; $K_{ab} = Y_a(\hat{Z}_{bb} - \hat{Z}_{ab})$; $K_{ac} = Y_a(\hat{Z}_{bc} - \hat{Z}_{ac})$; $Y_a = (\hat{Z}_{aa} - \hat{Z}_{ba})^{-1}$; $K_{bc} = (K_{bc1})^{-1} K_{cb1}$; $K_{bc1} = \hat{Z}_{ca} K_{ab} + \hat{Z}_{cb} - \hat{Z}_{ba} K_{ab} - \hat{Z}_{bb}$; and $K_{cb1} = \hat{Z}_{ba} K_{ac} + \hat{Z}_{bc} - \hat{Z}_{ca} K_{ac} - \hat{Z}_{cc}$. Once $I_p$ is given, all of the overhead conductor currents $I_a$, $I_b$, $I_c$, $I_G$ and $I_L$ can be evaluated step by step using (6.13), (6.14), and (6.16a–c). The net current returning through the earth, $I_E$, is the negative of the sum of all overhead conductor currents:

$$I_E = -\left[\sum_{k=1}^{n_p} (I_a)_k + \sum_{k=1}^{n_p} (I_b)_k + \sum_{k=1}^{n_p} (I_c)_k + \sum_{k=1}^{n_G} (I_G)_k + \sum_{k=1}^{n_L} (I_L)_k\right] \tag{6.17}$$

Table 6.1 Temperature effect

Temperature (°C)    15     20     25     30     35     40     45
Sag                 7.3    7.8    8.3    8.8    9.3    9.8    10.3

The sag of each conductor depends on the individual characteristics of the line and on the environmental conditions. Using the Overhead Cable Sag Calculation Program [15], the variation of sag with temperature can be calculated, as shown in Table 6.1. Once all system currents are known, the magnetic field they produce at any point can be calculated.
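The constraints behind (6.9b)–(6.17) can also be checked numerically. The sketch below does not reproduce the chapter's step-by-step elimination with the K matrices; instead it assembles the same constraints (bonded sub-conductors, grounded ground wires, closed loop with series impedance $Z_c$, prescribed phase currents) into one linear system and solves it by Gauss–Jordan elimination. The function names, the conductor ordering and the impedance matrix `z_toy` are hypothetical and serve only to illustrate the structure.

```python
import cmath
import math

def solve(a, b):
    """Solve a x = b for complex a by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[pivot] = m[pivot], m[c]
        for r in range(n):
            if r != c:
                factor = m[r][c] / m[c][c]
                m[r] = [x - factor * y for x, y in zip(m[r], m[c])]
    return [m[i][n] / m[i][i] for i in range(n)]

def conductor_currents(z, i_phase, z_c):
    """Currents of 9 sub-conductors, 2 ground wires and 2 loop wires."""
    n = 13            # assumed order: a1..a3, b1..b3, c1..c3, G1, G2, L1, L2
    size = n + 3      # unknowns: 13 currents plus one voltage drop per phase
    a = [[0j] * size for _ in range(size)]
    b = [0j] * size
    for r in range(9):             # bonded sub-conductors share the phase voltage drop
        a[r][:n] = z[r]
        a[r][n + r % 3] = -1.0
    for r in (9, 10):              # ground wires: zero voltage drop
        a[r][:n] = z[r]
    a[11][11] = a[11][12] = 1.0    # closed loop: I_L1 + I_L2 = 0
    for j in range(n):             # dV_L1 - dV_L2 = -Z_c * I_L1, cf. (6.12)
        a[12][j] = z[11][j] - z[12][j]
    a[12][11] += z_c
    for p in range(3):             # sub-conductor currents add up to the phase current
        for s in (0, 3, 6):
            a[13 + p][s + p] = 1.0
        b[13 + p] = i_phase[p]
    return solve(a, b)[:n]

# Hypothetical symmetric impedance matrix, for illustration only.
z_toy = [[(0.05 + 0.8j) if i == k else (0.01 + 0.3j) / (1 + abs(i - k))
          for k in range(13)] for i in range(13)]
i_p = [2000 * cmath.exp(1j * ang) for ang in (0.0, -2 * math.pi / 3, 2 * math.pi / 3)]
currents = conductor_currents(z_toy, i_p, z_c=-0.65j)
i_earth = -sum(currents)  # Eq. (6.17)
```

For a real line, `z_toy` would be replaced by the matrix $Z$ of (6.10). Solving the constraints in one step is a design choice made here for brevity; the chapter's elimination yields the same currents when both are applied to the same $Z$.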

6.3 Magnetic Field Calculations

Using the integration technique, which is explained in detail in [16] and reviewed here, the magnetic field produced at any point $P(x_o, y_o, z_o)$ by the M conductors of a multiphase line on its support structures, together with their images, can be obtained from the Biot–Savart law as [9, 16]:

$$H_o = \frac{1}{4\pi} \sum_{k=1}^{M} \sum_{n=-N}^{N} \int_{-d/2}^{d/2} \left[ (H_x)_k\,\vec{a}_x + (H_y)_k\,\vec{a}_y + (H_z)_k\,\vec{a}_z \right] dz \tag{6.18}$$

where

$$(H_x)_k = \frac{I_k\left[(z - z_o + nd)\sinh\left(\frac{z}{a_k}\right) - (y_k - y_o)\right]}{d_k} - \frac{I_k\left[(z - z_o + nd)\sinh\left(\frac{z}{a_k}\right) + (y_k + y_o + 2\bar{P})\right]}{d'_k} \tag{6.19}$$

$$(H_y)_k = \frac{I_k (x_k - x_o)}{d_k} - \frac{I_k (x_k - x_o)}{d'_k} \tag{6.20}$$

$$(H_z)_k = -\frac{I_k (x_k - x_o)\sinh\left(\frac{z}{a_k}\right)}{d_k} + \frac{I_k (x_k - x_o)\sinh\left(\frac{z}{a_k}\right)}{d'_k} \tag{6.21}$$

$$d_k = \left[(x_k - x_o)^2 + (y_k - y_o)^2 + (z - z_o + nd)^2\right]^{3/2} \tag{6.22}$$

$$d'_k = \left[(x_k - x_o)^2 + (y_k + y_o + 2\bar{P})^2 + (z - z_o + nd)^2\right]^{3/2} \tag{6.23}$$

The parameter N in (6.18) is the number of spans considered to the right and to the left of the generic span, for which n = 0, as shown in Fig. 6.1.
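A direct numerical evaluation of the integral in (6.18) can be sketched with the midpoint rule. The function names, the tuple layout `(x, lowest height, catenary parameter)` and the earth resistivity are assumptions made for this sketch, and the sign conventions follow one reading of the component formulas (6.19)–(6.21). A useful sanity check is the limit of a nearly flat conductor with many spans, where the result should approach the familiar two-dimensional field of a straight wire and its image at complex depth.

```python
import cmath
import math

MU0 = 4e-7 * math.pi

def complex_depth(freq=50.0, rho=100.0):
    """Complex depth for earth resistivity rho (ohm*m); rho is an assumed value."""
    return 1.0 / cmath.sqrt(1j * 2.0 * math.pi * freq * MU0 / rho)

def field_at(point, conductors, currents, span, n_spans, p_depth, steps=400):
    """Midpoint-rule evaluation of Eq. (6.18); returns phasors (Hx, Hy, Hz) in A/m.

    conductors: list of (x_k, h_k, a_k) with h_k the lowest height, a_k from (6.8).
    """
    xo, yo, zo = point
    hx = hy = hz = 0j
    dz = span / steps
    for (xk, h, a), ik in zip(conductors, currents):
        for n in range(-n_spans, n_spans + 1):
            for s in range(steps):
                z = -span / 2 + (s + 0.5) * dz
                yk = h + 2 * a * math.sinh(z / (2 * a)) ** 2   # Eq. (6.8)
                slope = math.sinh(z / a)                       # dy_k/dz
                zz = z - zo + n * span
                dx = xk - xo
                d = (dx ** 2 + (yk - yo) ** 2 + zz ** 2) ** 1.5
                di = (dx ** 2 + (yk + yo + 2 * p_depth) ** 2 + zz ** 2) ** 1.5
                hx += ik * (zz * slope - (yk - yo)) / d \
                    - ik * (zz * slope + (yk + yo + 2 * p_depth)) / di
                hy += ik * dx / d - ik * dx / di
                hz += -ik * dx * slope / d + ik * dx * slope / di
    c = dz / (4 * math.pi)
    return hx * c, hy * c, hz * c

# Nearly flat single conductor (very large a_k), 1 kA, field point 1 m above ground.
p_bar = complex_depth()
hx, hy, hz = field_at((5.0, 1.0, 0.0), [(0.0, 20.0, 1e9)], [1000.0], 400.0, 50, p_bar)
```

The step count and the number of spans trade accuracy for run time; the values used here are illustrative choices rather than recommendations from the chapter.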

6.4 Results and Discussion

The data used to calculate the magnetic field intensity at points one meter above ground level (field points) under the Egyptian 500 kV single-circuit transmission line are presented in Table 6.2. The phase-conductor currents are defined by a balanced direct-sequence three-phase set of 50 Hz sinusoidal currents of 2 kA rms, that is:

$$I_p = 2\left[1,\; \exp(-j2\pi/3),\; \exp(j2\pi/3)\right] \text{ kA} \tag{6.24}$$

Table 6.2 Characteristics of 500 kV line conductors

Conductor number   Radius (mm)   X-Coordinate (m)   Y-Coordinate (m)   Rdc at 20°C (Ω/km)
1a                 15.3          -13.425            22.13              0.0511
1b                 15.3          -12.975            22.13              0.0511
1c                 15.3          -13.2              21.74              0.0511
2a                 15.3          -0.225             24.48              0.0511
2b                 15.3          0.225              24.48              0.0511
2c                 15.3          0                  24.09              0.0511
3a                 15.3          12.975             22.13              0.0511
3b                 15.3          13.425             22.13              0.0511
3c                 15.3          13.2               21.74              0.0511
G1                 5.6           -8                 30                 0.564
G2                 5.6           8                  30                 0.564
L1                 11.2          -13.2              17                 0.1168
L2                 11.2          13.2               17                 0.1168

Figure 6.2 shows the effect of the number of spans (N) on the calculated magnetic field intensity. When the magnetic field intensity is calculated at point P1 (Fig. 6.1) and at distances away from the center phase, the effect of the number of spans is very small because of the symmetry of the spans around the calculation points, as shown in Fig. 6.1: the contributions of the catenaries d1 and d2 are equal, and smaller than the contribution of the catenary d, because they are far from the field points. When the magnetic field intensity is calculated at point P2 (Fig. 6.1) and at distances away from the center phase, however, the number of spans has a large effect (the field doubles). This is due to the contribution of the catenary d2, which in this case produces the same magnetic field intensity as the original span d, as shown in Fig. 6.1, while the catenary d1 makes only a small contribution to the calculated magnetic field intensity.

Fig. 6.2 The effect of the spans' numbers on the magnetic field intensity (curves at P1 and P2 for a single span, N = 0, and for N = 1, 2, 3; horizontal axis: distance from the center phase (m); vertical axis: magnetic field intensity (A/m))

Figure 6.3 shows the effect of temperature on the configuration of the overhead transmission line conductors (sag), and hence on the magnetic field intensity calculated using the 3D integration technique with the MTL technique. As the sag increases with temperature (as indicated in Table 6.1), the magnetic field intensity also increases.

Fig. 6.3 The effect of the temperatures on the magnetic field intensity (curves at P1 and P2 for 15 and 45 °C; horizontal axis: distance from the center phase (m); vertical axis: magnetic field intensity (A/m))

Figure 6.4 compares the magnetic field calculated with the 2D straight-line technique, in which the average conductor heights are used, and with the 3D integration technique with the MTL technique. The observed maximum error of -23.2959% (at point P1) and 49.877% (at point P2) is mainly due to neglecting the sag of the conductors. Figure 6.5 compares the magnetic field intensity calculated using the 3D integration technique with the MTL technique, with and without ground wires and with and without the short-circuited mitigation loop. It is seen that the observed maximum reduction of 1.9316% (at point P1) and 2.469%

(at point P2) is mainly due to neglecting the ground wires. With the short-circuited mitigation loop placed 5 m beneath the outer phase conductors, the magnetic field intensity is reduced significantly, with a maximum reduction of 25.7063% (at point P1) and 30.1525% (at point P2).

Fig. 6.4 Comparison between results of 2D and 3D techniques (curves: P1 at N = 2; P2 at N = 2; 2D with average heights; horizontal axis: distance from the center phase (m); vertical axis: magnetic field intensity (A/m))

Fig. 6.5 Comparison between the calculated magnetic field from the conductors only; the conductors and ground wires; and the conductors, ground wires and short-circuited (S.C.) mitigation loop (curves at P1 and P2; horizontal axis: distance from the center phase (m); vertical axis: magnetic field intensity (A/m))

Fig. 6.6 Effect of the reactance Xs, inserted in the mitigation loop, on the calculated magnetic field (curves at P1 and P2; horizontal axis: Xs (Ohm), from -2 to 0; vertical axis: magnetic field intensity (A/m))

The magnetic field intensity can be reduced further by inserting an appropriately chosen series

capacitor in the mitigation loop. To determine the optimal capacitance Cs of this capacitor, the magnetic field intensity is calculated at a point one meter above the ground surface under the center phase for different values of Zc = jXs, with the reactance Xs varying from -2 Ω to 0. Figure 6.6 shows the effect of the reactance Xs, inserted in the mitigation loop, on the magnetic field intensity. The optimal situation (minimum magnetic field intensity) is characterized by Cs = 4.897 mF, and the worst situation (maximum magnetic field intensity) by Cs = 2.358 mF.

Table 6.3 The effect of the mitigation loop heights on the calculated magnetic field intensity at point P1 and 26.4 m mitigation loop spacing. Entries are the magnetic field (A/m) at P1 at the given distance from the center phase.

Height of mitigation loop   Loop                        -15 m    -10 m    0 m      10 m     15 m
18 m                        Short circuit               15.03    17.77    20.83    19.74    16.83
                            With optimal capacitance    9.42     10.80    17.52    18.84    16.1
19 m                        Short circuit               14.93    17.76    20.45    19.71    16.78
                            With optimal capacitance    8.88     10.82    17.12    18.77    15.98
20 m                        Short circuit               14.64    17.52    19.94    19.49    16.57
                            With optimal capacitance    8.13     10.43    16.63    18.64    15.84
21 m                        Short circuit               14.19    17.06    19.26    19.01    16.13
                            With optimal capacitance    7.01     9.56     15.87    18.15    15.40
23 m                        Short circuit               14.10    17.01    19.00    19.31    16.43
                            With optimal capacitance    7.07     9.86     16.35    19.16    16.41
24 m                        Short circuit               16.64    19.80    21.19    21.33    18.24
                            With optimal capacitance    11.46    14.13    17.97    19.79    16.96
25 m                        Short circuit               18.03    21.33    22.46    22.53    19.31
                            With optimal capacitance    13.95    16.87    19.55    20.85    17.91
26 m                        Short circuit               18.95    22.34    23.34    23.35    20.04
                            With optimal capacitance    15.70    18.764   20.82    21.80    18.74
27 m                        Short circuit               19.61    23.08    24.00    23.96    20.58
                            With optimal capacitance    16.98    20.16    21.84    22.58    19.42

Tables 6.3 and 6.4 depict the effect of the mitigation loop height on the calculated magnetic field intensity at points P1 and P2, respectively, when the mitigation loop spacing is 26.4 m (exactly under the outer phases). It is seen that the optimal height is one meter below the outer phase

conductors when the mitigation loop is short-circuited, and about one meter above the outer phase conductors when an optimal capacitance is inserted in the mitigation loop.

Table 6.4 The effect of the mitigation loop heights on the calculated magnetic field intensity at point P2 and 26.4 m mitigation loop spacing. Entries are the magnetic field (A/m) at P2 at the given distance from the center phase.

Height of mitigation loop   Loop                        -15 m    -10 m    0 m      10 m     15 m
18 m                        Short circuit               7.97     8.89     9.88     9.43     8.61
                            With optimal capacitance    5.49     6.28     7.77     7.99     7.44
19 m                        Short circuit               7.77     8.7      9.68     9.26     8.44
                            With optimal capacitance    5.15     5.98     7.53     7.85     7.30
20 m                        Short circuit               7.48     8.4      9.37     9.00     8.20
                            With optimal capacitance    4.67     5.54     7.21     7.64     7.12
21 m                        Short circuit               7.09     7.99     8.93     8.61     7.84
                            With optimal capacitance    3.98     4.88     6.67     7.25     6.76
23 m                        Short circuit               6.79     7.71     8.72     8.49     7.72
                            With optimal capacitance    3.91     4.93     6.98     7.72     7.22
24 m                        Short circuit               8.06     9.09     10.06    9.65     8.74
                            With optimal capacitance    5.52     6.49     8.01     8.30     7.66
25 m                        Short circuit               8.76     9.85     10.82    10.32    9.33
                            With optimal capacitance    6.67     7.68     8.99     9.01     8.25
26 m                        Short circuit               9.23     10.37    11.34    10.78    9.74
                            With optimal capacitance    7.5      8.56     9.76     9.61     8.75
27 m                        Short circuit               9.58     10.75    11.73    11.13    10.05
                            With optimal capacitance    8.13     9.22     10.37    10.09    9.17

Tables 6.5 and 6.6 depict the effect of the mitigation loop spacing on the calculated magnetic field intensity at points P1 and P2, respectively, when the mitigation loop height is 21 m. It is seen that the optimal spacing is the spacing of the outer phase conductors. Figure 6.7 compares the magnetic field intensity produced by the conductors, ground wires and short-circuited mitigation loop with that produced by the conductors, ground wires and mitigation loop with the optimal capacitance and the optimal parameters obtained from Tables 6.3, 6.4, 6.5 and 6.6. It is seen that the magnetic field intensity decreases further, with a maximum reduction of 8.0552% (at point P1) and 19.5326% (at point P2).
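The capacitances quoted for the mitigation loop can be related to the reactance axis of Fig. 6.6 through $Z_c = jX_s = 1/(j\omega C_s)$, which gives $X_s = -1/(\omega C_s)$. The short sketch below performs this conversion at 50 Hz; the function names are chosen here for illustration only.

```python
import math

def reactance_from_capacitance(c_farads, freq=50.0):
    """Series reactance X_s (ohm) of a capacitor, from Z_c = j*X_s = 1/(j*w*C_s)."""
    return -1.0 / (2.0 * math.pi * freq * c_farads)

def capacitance_from_reactance(x_ohms, freq=50.0):
    """Capacitance C_s (F) giving a capacitive (negative) reactance X_s."""
    if x_ohms >= 0.0:
        raise ValueError("a capacitor requires a negative reactance")
    return -1.0 / (2.0 * math.pi * freq * x_ohms)

x_best = reactance_from_capacitance(4.897e-3)   # optimal case reported in the text
x_worst = reactance_from_capacitance(2.358e-3)  # worst case reported in the text
```

Under this relation the reported optimum corresponds to a reactance of about -0.65 Ω and the worst case to about -1.35 Ω, both inside the scanned range from -2 Ω to 0.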

Table 6.5 The effect of the mitigation loop spacings on the calculated magnetic field intensity at point P1 and 21 m height. Entries are the magnetic field (A/m) at P1 at the given distance from the center phase.

Distance of mitigation loop from the center phase   Loop                        -15 m    -10 m    0 m      10 m     15 m
5 m                                                 Short circuit               21.54    25.03    24.88    25.56    22.20
                                                    With optimal capacitance    20.77    24.07    23.28    24.83    21.75
7.5 m                                               Short circuit               20.43    23.48    23.24    24.15    21.22
                                                    With optimal capacitance    18.65    21.21    20.67    22.73    20.26
10 m                                                Short circuit               18.35    20.90    21.42    21.98    19.45
                                                    With optimal capacitance    14.77    16.58    18.25    20.19    18.01
13.2 m                                              Short circuit               14.19    17.06    19.26    19.01    16.13
                                                    With optimal capacitance    7.01     9.56     15.87    18.15    15.40
15 m                                                Short circuit               14.57    18.22    20.51    20.19    16.69
                                                    With optimal capacitance    7.66     11.28    17.14    19.17    16.12

Table 6.6 The effect of the mitigation loop spacings on the calculated magnetic field intensity at point P2 and 21 m height. Entries are the magnetic field (A/m) at P2 at the given distance from the center phase.

Distance of mitigation loop from the center phase   Loop                        -15 m    -10 m    0 m      10 m     15 m
5 m                                                 Short circuit               10.69    11.89    12.79    12.17    11.06
                                                    With optimal capacitance    10.21    11.32    12.16    11.73    10.72
7.5 m                                               Short circuit               10.06    11.14    11.96    11.46    10.47
                                                    With optimal capacitance    9.01     9.94     10.73    10.57    9.77
10 m                                                Short circuit               9.01     9.97     10.77    10.38    9.52
                                                    With optimal capacitance    7.13     7.93     8.91     9.05     8.44
13.2 m                                              Short circuit               7.09     7.99     8.93     8.61     7.84
                                                    With optimal capacitance    3.98     4.88     6.67     7.25     6.76
15 m                                                Short circuit               7.28     8.33     9.41     8.98     8.08
                                                    With optimal capacitance    4.34     5.38     7.24     7.69     7.10

Fig. 6.7 Comparison between the calculated magnetic field intensity values result from the conductors, ground wires and short circuit mitigation loop; and from the conductors, ground wires and mitigation loop with capacitance of optimal value at optimal height and spacing (curves at P1 and P2; horizontal axis: distance from the center phase (m); vertical axis: magnetic field intensity (A/m))

6.5 Conclusion

The effects of the currents in the sub-conductors of each phase bundle, the currents in the ground wires, the currents in the mitigation loop, and the earth return currents on the calculated magnetic field are investigated using the MTL technique. Furthermore, the effects of the conductors' sag between towers and of the variation of sag with temperature on the calculated magnetic field are studied. Finally, the design parameters of the passive loop conductor for the Egyptian 500 kV overhead transmission line are obtained at ambient temperature (35°C).

References

1. International Association of Engineers [Online]. Available: http://www.iaeng.org
2. El Dein AZ (2010) Mitigation of magnetic field under Egyptian 500 kV overhead transmission line. Lecture notes in engineering and computer science: Proceeding of the World Congress on Engineering, vol II WCE 2010, 30 June–2 July 2010, London, UK, pp 956–961
3. Hossam-Eldin AA (2001) Effect of electromagnetics fields from power lines on living organisms. In: IEEE 7th International Conference on Solid Dielectrics, June 25–29, Eindhoven, The Netherlands, pp 438–441
4. Karawia H, Youssef K, Hossam-Eldin AA (2008) Measurements and evaluation of adverse health effects of electromagnetic fields from low voltage equipments. MEPCON Aswan, Egypt, March 12–15, pp 436–440
5. Dahab AA, Amoura FK, Abu-Elhaija WS (2005) Comparison of magnetic-field distribution of noncompact and compact parallel transmission-line configurations. IEEE Trans Power Deliv 20(3):2114–2118
6. Stewart JR, Dale SJ, Klein KW (1993) Magnetic field reduction using high phase order lines. IEEE Trans Power Deliv 8(2):628–636
7. Yamazaki K, Kawamoto T, Fujinami H (2000) Requirements for power line magnetic field mitigation using a passive loop conductor. IEEE Trans Power Deliv 15(2):646–651
8. Olsen RG, Wong P (1992) Characteristics of low frequency electric and magnetic fields in the vicinity of electric power lines. IEEE Trans Power Deliv 7(4):2046–2053
9. Begamudre RD (2006) Extra high voltage AC transmission engineering, 3rd edn, Chapter 7. Wiley Eastern Limited, pp 172–205
10. Brandão Faria JA, Almeida ME (2007) Accurate calculation of magnetic-field intensity due to overhead power lines with or without mitigation loops with or without capacitor compensation. IEEE Trans Power Deliv 22(2):951–959
11. de Villiers W, Cloete JH, Wedepohl LM, Burger A (2008) Real-time sag monitoring system for high-voltage overhead transmission lines based on power-line carrier signal behavior. IEEE Trans Power Deliv 23(1):389–395
12. Noda T (2005) A double logarithmic approximation of Carson's ground-return impedance. IEEE Trans Power Deliv 21(1):472–479
13. Ramirez A, Uribe F (2007) A broad range algorithm for the evaluation of Carson's integral. IEEE Trans Power Deliv 22(2):1188–1193
14. Benato R, Caldon R (2007) Distribution line carrier: analysis procedure and applications to DG. IEEE Trans Power Deliv 22(1):575–583
15. Overhead Cable Sag Calculation Program. http://infocom.cqu.edu.au/Staff/Michael_O_malley/web/overhead_cable_sag_calculator.html
16. El Dein AZ (2009) Magnetic field calculation under EHV transmission lines for more realistic cases. IEEE Trans Power Deliv 24(4):2214–2222

Chapter 7

Universal Approach of the Modified Nodal Analysis for Nonlinear Lumped Circuits in Transient Behavior

Lucian Mandache, Dumitru Topan and Ioana-Gabriela Sirbu

Abstract Recent approaches for time-domain analysis of lumped circuits deal with differential-algebraic-equation (DAE) systems instead of SPICE-type resistive models. Although simple and powerful, DAE models based on modified nodal approaches require some restrictions related to redundant variables or circuit topology. In this context, the paper proposes an improved version that allows treating nonlinear analog circuits of any topology, including floating capacitors, magnetically coupled inductors, excess elements and controlled sources. The procedure has been implemented in a dedicated program that builds the symbolic DAE model and solves it numerically.

L. Mandache (✉) · I.-G. Sirbu, Faculty of Electrical Engineering, University of Craiova, 107 Decebal Blv, 200440, Craiova, Romania; e-mail: [email protected]
I.-G. Sirbu, e-mail: [email protected]
D. Topan, Faculty of Electrical Engineering, University of Craiova, 13 A.I. Cuza Str, 200585, Craiova, Romania; e-mail: [email protected]

S. I. Ao and L. Gelman (eds.), Electrical Engineering and Applied Computing, Lecture Notes in Electrical Engineering, 90, DOI: 10.1007/978-94-007-1192-1_7, © Springer Science+Business Media B.V. 2011

7.1 Introduction

The transient analysis of analog nonlinear circuits requires a numerical integration that is commonly performed through associated discrete circuit models (SPICE-type models). In this manner, resistive circuits are solved sequentially at each time step [1–3]. Other strategies build state or semistate mathematical models, in the form of differential or differential–algebraic equation systems [4–6]. Such models are solved by dedicated numerical methods without resorting to equivalent circuit models, so the problem of circuit analysis is transferred to a purely mathematical one. The latter strategy has been extended during the last decades, taking advantage of the progress of information technology [7–12].

The paper focuses on a method based on semistate equations and associated with the modified nodal approach. This method avoids singular matrices in the equation system and overcomes the restriction related to floating capacitors [1, 14]. It also benefits from our previously developed topological analysis based on a single connection graph [3, 13] instead of two or more graphs [2, 10], even when the circuit contains controlled sources. A simple, robust and comprehensive method is obtained.

The semistate mathematical model corresponding to the modified nodal approach (MNA) in the time domain has the general form of an ordinary differential equation system:

$$M(x,t)\,\dot{x}(t) + N(x,t)\,x(t) = f(x,t), \qquad x(t_0) = x_0. \tag{7.1}$$

The vector of circuit variables $x(t)$, with the initial value $x_0$, contains the vector of the node voltages $v_{n-1}$ and the vector of the branch currents $i_m$ that cannot be expressed in terms of node voltages and/or their first-order derivatives:

$$x(t) = \begin{bmatrix} v_{n-1}(t) \\ i_m(t) \end{bmatrix}. \tag{7.2}$$

Therefore, the vector $i_m$ contains the currents of zero-impedance elements (so-called MNA-incompatible elements): independent and controlled voltage sources, the controlling currents of current-controlled sources, the inductor currents and the currents of the current-controlled nonlinear resistors [1, 3, 5, 13]. $M(x,t)$ and $N(x,t)$ are square matrices that generally depend on state and time and contain the parameters of the nonlinear elements.
The matrix $M$ contains the inductances and capacitances of the energy storage circuit elements (dynamic parameters for the nonlinear storage elements), while the matrix $N$ contains the resistances and conductances of the resistors (dynamic parameters for the nonlinear resistors). Since the matrix $M$ is commonly singular, the mathematical model (7.1) requires special treatment. The vector $f(x,t)$ contains the circuit excitations and the parameters associated with the incremental sources used in the local linearization of the nonlinear resistors: current sources for voltage-controlled nonlinear resistors and voltage sources for current-controlled nonlinear resistors. Although building the mathematical model (7.1) is relatively simple, the existence of a unique solution is debatable in most cases (the problem of the possible singularity of the matrix $M$ has already been reported).


The paper is organized as follows: Sect. 7.2 explains the problem of floating capacitors in relation to the existence and uniqueness of the solution of Eq. 7.1; the improved version of the MNA, which yields an equivalent well-posed equation, is described in Sect. 7.3; and an example is treated in Sect. 7.4.

7.2 The Problem of Floating Capacitors Related to MNA

The discussion of floating capacitors requires building the circuit connection graph. We adopt the single-graph procedure, with its specific preliminary actions related to the appropriate modeling of the controlling ports of controlled sources [3]. Once the connection graph has been built and the ground node has been chosen, the capacitor subgraph is simply extracted. If the capacitor subgraph of the circuit is not connected, then redundant variables appear in Eq. 7.1 and the circuit response cannot be computed, as shown below.

The DAE mathematical model based on the MNA requires that any linear or nonlinear capacitor be linked to the ground node through a path of capacitors. A capacitor that does not meet this requirement is called a floating capacitor (see Fig. 7.1). The time-domain nodal equations for such a structure are:

$$\begin{cases} C_k \dot{v}_p - C_k \dot{v}_q + \sum\limits_{j \in (p)} i_j = 0, \\ -C_k \dot{v}_p + C_k \dot{v}_q + \sum\limits_{s \in (q)} i_s = 0, \end{cases} \tag{7.3}$$

where the state matrix

$$M_k = \begin{bmatrix} C_k & -C_k \\ -C_k & C_k \end{bmatrix} \tag{7.4}$$

is obviously singular. Therefore, such a mathematical model is inappropriate.

Fig. 7.1 Floating capacitor (capacitor $C_k$ carrying the current $i_k$ between nodes (p) and (q), with node voltages $v_p$ and $v_q$, the other branch currents $i_j$ and $i_s$, and the enclosing cutset Σ)

If one of the equations (7.3) is replaced by the cutset current law expressed for the cutset Σ surrounding the floating capacitor:

$$\begin{cases} C_k \dot{v}_p - C_k \dot{v}_q + \sum\limits_{j \in (p)} i_j = 0, \\ \sum\limits_{\substack{j \in (p) \\ j \neq k}} i_j + \sum\limits_{\substack{s \in (q) \\ s \neq k}} i_s = 0, \end{cases} \tag{7.5}$$

then the singular matrix is avoided. The second equation in (7.5) can be obtained simply by adding both nodal equations (7.3). Nevertheless, the equation system (7.5) contains a redundant variable, because two dynamic variables are involved in only one differential equation. Moreover, these two state variables correspond to only one capacitor. If the capacitor is grounded, with $v_q = 0$, then the nodal equation associated with the node (p) becomes

$$C_k \dot{v}_p + \sum_{j \in (p)} i_j = 0 \tag{7.6}$$

and the problem of a singular matrix or a redundant variable does not appear. Extrapolating the above reasoning, if any subgraph of capacitors is floating, then the number of variables exceeds the number of essential capacitors, one variable being redundant. To overcome the problem of the redundant variable, a change of variables is performed: the node voltages are replaced by the tree-branch voltages.
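The singularity of the state matrix in (7.4) and the redundancy of the two node voltages can be illustrated with a few lines of Python. This is a minimal sketch with an assumed capacitance value and hypothetical function names: it shows that the determinant of $M_k$ vanishes and that the capacitor terms of (7.3) depend only on the difference of the two node-voltage derivatives, which is the derivative of the capacitor (tree-branch) voltage.

```python
def det2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def capacitor_terms(c_k, vdot_p, vdot_q):
    """Capacitor contributions to the two nodal equations (7.3)."""
    return (c_k * vdot_p - c_k * vdot_q, -c_k * vdot_p + c_k * vdot_q)

c_k = 1e-6  # assumed capacitance (F), for illustration only
m_k = [[c_k, -c_k], [-c_k, c_k]]  # state matrix of Eq. (7.4)

singular = det2(m_k) == 0.0
# Two different pairs of node-voltage derivatives with the same difference.
terms_1 = capacitor_terms(c_k, 3.0, 1.0)
terms_2 = capacitor_terms(c_k, 5.0, 3.0)
# Adding both nodal equations removes the capacitor terms, cf. (7.5).
cutset_residual = terms_1[0] + terms_1[1]
```

Both pairs give the same capacitor terms, so the two node voltages cannot be determined separately from the capacitor equation; a single variable, the voltage across the capacitor, carries all the dynamic information.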

7.3 Improved Version of the MNA

To overcome the problem of singular matrices and redundant variables introduced by floating capacitors, our method consists of three main steps:

Step 1  Build the modified nodal equations, ignoring the floating capacitor problem;

Step 2  Identify all subgraphs of floating capacitors, as well as the nodal equations related to their nodes [1, 3]; for each such subgraph, replace one of the nodal equations by the cutset current law expressed for the cutset surrounding the subgraph, as in the second equation of (7.5); an equivalent mathematical model is obtained, with a general form similar to (7.1):

$$M' \begin{bmatrix} \dot{v}_{n-1}(t) \\ \dot{i}_m(t) \end{bmatrix} + N' \begin{bmatrix} v_{n-1}(t) \\ i_m(t) \end{bmatrix} = f'. \tag{7.7}$$

Step 3  Perform a change of variables: the vector of the node voltages $v_{n-1}$ is replaced by the vector of the tree-branch voltages $u_t$, the vector $i_m$ remaining unchanged.


As is known, the MNA does not require finding a normal tree of the given circuit. Nevertheless, a normal tree is required in order to perform the change of variables. We previously developed a simple and efficient method to build normal trees systematically [3, 13], which requires only a few preliminary adjustments in the circuit diagram: the controlling branches of voltage-controlled sources must be modeled by zero-valued independent current sources, and the controlling branches of current-controlled sources must be modeled by zero-valued independent voltage sources. Magnetically coupled inductors need to be modeled through equivalent diagrams with controlled sources. The normal tree is also necessary for identifying the excess capacitors and inductors. Once a normal tree has been found, Step 3 of our algorithm can be performed. The node-branch incidence matrix is partitioned according to the tree/cotree branches:

$$A = [\, A_t \mid A_c \,], \tag{7.8}$$

where $A_t$ corresponds to the tree branches and $A_c$ corresponds to the cotree branches [3, 13]. Next, the tree-branch voltages may be expressed in terms of the node voltages [2, 13] using the transpose of the square nonsingular matrix $A_t$:

$$u_t = A_t^{\mathrm{T}} v_{n-1}. \tag{7.9}$$

Since the existence of the normal tree guarantees that the matrix $A_t$ is always square and nonsingular, and consequently invertible, the node voltages in (7.9) can be expressed in terms of the tree-branch voltages:

$$v_{n-1} = A' u_t, \tag{7.10}$$

where $A'$ denotes the inverse of the matrix $A_t^{\mathrm{T}}$. The inverse matrix $A'$ can be obtained relatively simply, owing to the sparsity of $A_t$, whose nonzero elements are equal to +1 or -1. Using (7.10) to substitute the vector $v_{n-1}$ in (7.7), the mathematical model becomes

$$M' \begin{bmatrix} A' & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} \dot u_t \\ \dot i_m \end{bmatrix} + N' \begin{bmatrix} A' & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} u_t \\ i_m \end{bmatrix} = f' \tag{7.11}$$

or

$$M''(x', t)\, \dot x'(t) + N''(x', t)\, x'(t) = f'(x', t) \tag{7.12}$$

where obvious notations were used. The new vector of variables is $x'(t)$. We extract from $x'$ the essential capacitor voltages and the essential inductor currents, as elements of the state vector of length $s$ (the subscript s comes from "state"):

$$x_s = \begin{bmatrix} u_C \\ i_L \end{bmatrix}. \tag{7.13}$$
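The change of variables in (7.9) and (7.10) can be illustrated on a very small tree. The sketch below assumes NumPy and a hypothetical three-node chain (branch 1 from node 1 to ground, branch 2 from node 2 to node 1, branch 3 from node 3 to node 2), with the sign convention +1 where a branch leaves a node and -1 where it enters; none of these choices come from the chapter's example circuit.

```python
import numpy as np

# Reduced node-branch incidence matrix of the tree branches (rows: nodes
# 1..3, columns: tree branches 1..3); +1 = branch leaves the node,
# -1 = branch enters the node. Hypothetical chain-shaped tree.
A_t = np.array([
    [1.0, -1.0, 0.0],
    [0.0, 1.0, -1.0],
    [0.0, 0.0, 1.0],
])

# Eq. (7.9): tree-branch voltages from node voltages.
v_nodes = np.array([5.0, 7.0, 4.0])  # hypothetical node voltages in volts
u_tree = A_t.T @ v_nodes

# Eq. (7.10): A' is the inverse of the transposed tree incidence matrix.
A_prime = np.linalg.inv(A_t.T)
v_recovered = A_prime @ u_tree

print("tree-branch voltages:", u_tree)
print("A' =")
print(A_prime)
print("recovered node voltages:", v_recovered)
```

For this chain, each node voltage is simply the sum of the tree-branch voltages along the path to ground, so the inverse is as sparse and simple as the text suggests.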


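The remaining elements of $x'$ are grouped in the vector $x_a$. Organizing the vector of variables as above splits the equation system (7.12) as follows:

$$\begin{bmatrix} M_{ss} & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} \dot x_s \\ \dot x_a \end{bmatrix} + \begin{bmatrix} N_{ss} & N_{sa} \\ N_{as} & N_{aa} \end{bmatrix} \begin{bmatrix} x_s \\ x_a \end{bmatrix} = \begin{bmatrix} f_s \\ f_a \end{bmatrix}. \tag{7.14}$$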

We remark that only the partition $M_{ss}$, of size $s \times s$, of the matrix $M''$ is nonsingular, all other elements being zero. A differential–algebraic equation system thus results:

$$\begin{cases} M_{ss}\, \dot x_s(t) + N_{ss}\, x_s(t) + N_{sa}\, x_a(t) = f_s(x_s, x_a, t), \\ N_{as}\, x_s(t) + N_{aa}\, x_a(t) = f_a(x_s, x_a, t), \end{cases} \tag{7.15}$$

with the initial condition

$$x_s(t_0) = \begin{bmatrix} u_C(t_0) \\ i_L(t_0) \end{bmatrix}. \tag{7.16}$$

Therefore, the vector $x_s$ contains the variables of the differential equation system, while $x_a$ contains the variables of the algebraic equation system (the subscript a comes from "algebraic"). In order to find the time-domain solution, many numerical techniques suitable for DAEs can be used. In principle, the computation procedure requires the discretization of the analysis time and running the following steps:

• Solve the algebraic equation from (7.15), assigning the initial values to the state variables:

$$N_{as}\, x_s(t_0) + N_{aa}\, x_a = f_a \tag{7.17}$$

in order to find the solution $x_a(t_0)$.

• Perform a numerical integration of the differential equation from (7.15) over the first discrete time interval $(t_0, t_1)$, assigning the value $x_a(t_0)$ to the vector $x_a$ and considering (7.16) as the initial condition:

$$\begin{cases} M_{ss}\, \dot x_s + N_{ss}\, x_s + N_{sa}\, x_a(t_0) = f_s, \\ x_s(t_0) = x_{s0}. \end{cases} \tag{7.18}$$

The solution $x_s(t_1)$ is obtained.

• At time step $k$, solve the algebraic equation, assigning to the state variables the values $x_s(t_k)$ calculated previously, during the numerical integration over the time interval $(t_{k-1}, t_k)$:

$$N_{as}\, x_s(t_k) + N_{aa}\, x_a = f_a. \tag{7.19}$$

The solution $x_a(t_k)$ is found.

• Perform a numerical integration of the differential equation over the next discrete time interval $(t_k, t_{k+1})$, assigning the previously computed value $x_a(t_k)$ to the vector $x_a$ and considering the values $x_s(t_k)$ as the initial condition:

$$\begin{cases} M_{ss}\, \dot x_s + N_{ss}\, x_s + N_{sa}\, x_a(t_k) = f_s, \\ x_s(t_k) = x_{sk}. \end{cases} \tag{7.20}$$

The solution $x_s(t_{k+1})$ is obtained.

The last two steps are repeated until the final moment of the analysis time is reached. The efficiency of the iterative algorithms used for solving the nonlinear algebraic equations is significantly enhanced if $x_a(t_{k-1})$ is used as the starting point. The method described above has been implemented in a computation program under the high performance computing environment MATLAB. It reads the input data stored in a SPICE-compatible netlist, performs a topological analysis in order to build a normal tree and the incidence matrices, identifies the excess elements and floating capacitors, builds the symbolic mathematical model as in expression (7.15), solves it numerically and represents the solution graphically.
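The alternating procedure of (7.17)–(7.20) can be sketched for the simplest case. This is not the authors' MATLAB program: it is a minimal Python illustration assuming constant (linear) matrices, a fixed time step and one backward Euler step per interval in place of the variable-step Gear algorithm mentioned later in the chapter. The function name `simulate_semistate` and the RC test circuit (source E, resistor R, capacitor C) are hypothetical.

```python
import math

import numpy as np


def simulate_semistate(m_ss, n_ss, n_sa, n_as, n_aa, f_s, f_a, xs0, t0, t_end, h):
    """Alternate the algebraic solve (7.19) with an integration step (7.20).

    Assumes constant matrices; f_s(t) and f_a(t) return the source vectors.
    """
    n_steps = int(round((t_end - t0) / h))
    xs = np.asarray(xs0, dtype=float)
    times, xs_hist, xa_hist = [], [], []
    for k in range(n_steps + 1):
        t = t0 + k * h
        # Algebraic step: N_aa x_a = f_a - N_as x_s(t_k)
        xa = np.linalg.solve(n_aa, f_a(t) - n_as @ xs)
        times.append(t)
        xs_hist.append(xs.copy())
        xa_hist.append(xa.copy())
        if k == n_steps:
            break
        # Differential step over (t_k, t_k+1) with x_a frozen at x_a(t_k),
        # discretized by one backward Euler step:
        # (M_ss/h + N_ss) x_s(t_k+1) = f_s - N_sa x_a(t_k) + M_ss x_s(t_k)/h
        lhs = m_ss / h + n_ss
        rhs = f_s(t + h) - n_sa @ xa + m_ss @ xs / h
        xs = np.linalg.solve(lhs, rhs)
    return np.array(times), np.array(xs_hist), np.array(xa_hist)


# Hypothetical test circuit: E -> R -> C to ground.
# State x_s = [u_C], algebraic variable x_a = [i] (loop current):
#   C du_C/dt - i = 0      (differential equation)
#   u_C + R i     = E      (algebraic equation)
R, C, E = 1e3, 1e-6, 10.0
tau = R * C
times, xs_hist, xa_hist = simulate_semistate(
    m_ss=np.array([[C]]), n_ss=np.array([[0.0]]), n_sa=np.array([[-1.0]]),
    n_as=np.array([[1.0]]), n_aa=np.array([[R]]),
    f_s=lambda t: np.zeros(1), f_a=lambda t: np.array([E]),
    xs0=[0.0], t0=0.0, t_end=tau, h=tau / 1000.0,
)
u_final = xs_hist[-1, 0]
u_exact = E * (1.0 - math.exp(-1.0))
print(f"u_C(tau) numerical = {u_final:.4f} V, analytical = {u_exact:.4f} V")
```

Freezing the algebraic variables over each interval makes the scheme only first-order accurate, so the step must be small compared with the fastest time constant; an adaptive multistep integrator relaxes that restriction.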

7.4 Example

Let us study the transient behavior of an electromechanical system with a brushed permanent magnet DC motor supplied by a half-wave uncontrolled rectifier, the mechanical load being nonlinear. The equivalent diagram built according to the transient model is shown in Fig. 7.2. It is not our goal to explain here the correspondence between the electromechanical system and the equivalent circuit diagram, or to judge the results from the point of view of their technical use. Only the algorithm described above will be emphasized. The diagram contains one floating capacitor (branch 18) and two nonlinear resistors (the current-controlled nonlinear resistor of branch 11 is the model of the nonlinear mechanical load, reproducing the speed-torque curve, and the voltage-controlled nonlinear resistor of branch 16 corresponds to the semiconductor diode), whose characteristics are shown in Fig. 7.3. The independent zero-current sources 9, 10 and 14 correspond to the controlling branches of the voltage-controlled sources 5, 13 and 7 respectively, while the independent zero-voltage source 4 corresponds to the controlling branch of the current-controlled current source of branch 6. The circuit is supplied by the independent sinusoidal voltage source of branch 1.

Fig. 7.2 Circuit example (equivalent diagram including the permanent magnet DC motor)

Fig. 7.3 Nonlinear resistors characteristics (panels: diode, branch 16; nonlinear resistor, branch 11; axes in [A] and [V])

If node 10 is grounded, the topological analysis performed by our computing program gives the result:

    The circuit does not contain excess inductors
    Normal tree branches: 1 4 5 8 15 18 16 2 12
    MNA-incompatible branches: 1 3 4 5 11
    Floating capacitor subgraph 1: reference_node: 8 other_nodes: 7

Therefore, the semistate variables are $x_s = [u_8, u_{15}, u_{18}, i_3]^{\mathrm{T}}$, and the variables of the algebraic equation system are $x_a = [u_1, u_4, u_5, u_{16}, u_2, u_{12}, i_1, i_4, i_5, i_{11}]^{\mathrm{T}}$. The computing program produces the mathematical model in the symbolic form of type (7.15):


• The differential equation system:

    C15 Du15 = G17 u18 - (G17 - Gd16) u16 - G2 u2 + J0R16
    C8 Du8 = G7_10 u12 - B6_4 i4 + J9 + J14
    C18 Du18 = G19 u15 - G19 u1 - (Gd16 + G19) u16 - J0R16
    L3 Di3 = u15 + u4 - u5 + u2

• The algebraic equation system:

    i3 - G2 u2 = 0
    i3 + i4 = 0
    i4 - i5 = 0
    G13_14 u8 + G12 u12 + i11 + J10 = 0
    G19 u15 - G17 u18 + G19 u1 + (G17 + Gd16 + G19) u16 + J0R16 = 0
    G19 u15 - G19 u1 - G19 u16 - i1 = 0
    u1 + Esin1 = 0
    u4 + E4 = 0
    A5_9 u8 + u5 = 0
    u12 - Rd11 i11 - E0R11 = 0

Since the mathematical model above is generated automatically by the computing program, some non-obvious notations are used (e.g. Du15 is the first derivative of the state variable u15; Gd16 is the conductance of the nonlinear resistor of branch 16; J0R16 is the incremental current source used in the local linearization of the voltage-controlled nonlinear resistor of branch 16; A5_9 is the voltage gain of the voltage-controlled voltage source of branch 5 controlled by branch 9). Assuming zero initial conditions, the solving algorithm gives the result as time-domain functions, some examples being shown in Fig. 7.4. Although the analysis time was 800 ms in order to cover the slowest component of the transient response, only details for the first 100 ms are shown in Fig. 7.4. The DAE system has been solved using Gear's numerical integration algorithm with variable time step, combined with a Newton–Raphson algorithm. With the computation errors restricted to the limit values of $10^{-7}$ (absolute value) and $10^{-4}$ (relative value), the time step of the numerical integration process was maintained between 47 ns and 558 µs. We remark that the same results have been obtained through a reference SPICE simulation, using the version ICAP/4 from Intusoft [15].

Fig. 7.4 Example of analysis results (panels: u8 [V], u15 [V], u18 [V] and i3 [A] versus time [s], over the first 0.1 s)

7.5 Conclusion

An efficient and fully feasible algorithm intended for the time-domain analysis of nonlinear lumped analog circuits was developed and implemented in a computation program. It overcomes some restrictions of the modified nodal approaches, having a practically unlimited degree of generality for RLCM circuits. The algorithm benefits from the simplicity of the MNA, and the numerical methods for solving the mathematical model are flexible and can be optimized without requiring any companion diagrams (unlike the SPICE-like algorithms). In this manner, the computation time and the computer requirements can be reduced as compared to other methods. Our contribution is demonstrated by an example, the chosen circuit containing nonlinear resistors, floating capacitors and controlled sources.

Acknowledgments This work was supported in part by the Romanian Ministry of Education, Research and Innovation under Grant PCE 539/2008.


References

1. Mandache L, Topan D (2010) Improved modified nodal analysis of nonlinear analog circuits in the time domain. Lecture notes in engineering and computer science, vol 2184, Proceedings of the World Congress on Engineering, London UK, June 30–July 2, pp 905–908
2. Chua LO, Lin PM (1975) Computer-aided analysis of electronic circuits: algorithms and computational techniques. Prentice-Hall, Englewood Cliffs
3. Iordache M, Mandache L (2004) Computer-aided analysis of nonlinear analog circuits (original title in Romanian). Ed. Politehnica Press, Bucharest (in Romanian)
4. Hodge A, Newcomb R (2002) Semistate theory and analog VLSI design. IEEE Circuits Syst Mag Second Quart 2(2):30–51
5. Newcomb R (1981) The semistate description of nonlinear time-variable circuits. IEEE Trans Circuits Syst CAS-28(1):62–71
6. Ho CW, Ruehli AE, Brennan PA (1975) The modified nodal approach to network analysis. IEEE Trans Circuits Syst CAS-22:504–509
7. Yamamura K, Sekiguchi T, Inoue Y (1999) A fixed-point homotopy method for solving modified nodal equations. IEEE Trans Circuits Syst I: Fundam Theory Appl 46(6):654–665
8. Brambilla A, Premoli A, Storti-Gajani G (2005) Recasting modified nodal analysis to improve reliability in numerical circuit simulation. IEEE Trans Circuits Syst I: Regul Pap 52(3):522–534
9. Lee K, Park SB (1985) Reduced modified nodal approach to circuit analysis. IEEE Trans Circuits Syst 32(10):1056–1060
10. Chang FY (1997) The unified nodal approach to circuit analysis. In: IEEE International Symposium on Circuits and Systems, June 9–12, 1997, Hong Kong, pp 849–852
11. Hu JD, Yao H (1988) Generalized modified nodal formulation for circuits with nonlinear resistive multiports described by sample data. In: IEEE International Symposium on Circuits and Systems, vol 3, 7–9 June 1988, pp 2205–2208
12. Kang Y, Lacy JG (1992) Conversion of MNA equations to state variable form for nonlinear dynamical circuits. Electron Lett 28(13):1240–1241
13. Topan D, Mandache L (2007) Special matters of circuit analysis (original title in Romanian). Universitaria, Craiova (in Romanian)
14. Mandache L, Topan D (2003) An extension of the modified nodal analysis method. In: European Conference on Circuit Theory and Design ECCTD '03, September 1–4 2003, Kraków, pp II-410–II-413
15. *** ICAP/4, Is SPICE 4 User's guide (1998) Intusoft, San Pedro, California USA

Chapter 8

Modified 1.28 Tbit/s (32 × 4 × 10 Gbit/s) Absolute Polar Duty Cycle Division Multiplexing-WDM Transmission Over 320 km Standard Single Mode Fiber

Amin Malekmohammadi

Abstract A new version of the Absolute Polar Duty Cycle Division Multiplexing (AP-DCDM) transmission scheme over a Wavelength Division Multiplexing (WDM) system is proposed. We modeled and analyzed a method to improve the performance of AP-DCDM over a WDM system by using a Dual-Drive Mach–Zehnder-Modulator (DD-MZM). An improvement of almost 4.1 dB in the receiver sensitivity of 1.28 Tbit/s (32 × 40 Gbit/s) AP-DCDM-WDM over 320 km of fiber is achieved by optimizing the bias voltage in the DD-MZM.

A. Malekmohammadi, Department of Electrical and Electronic Engineering, The University of Nottingham, Malaysia Campus, Kuala Lumpur, Malaysia

S. I. Ao and L. Gelman (eds.), Electrical Engineering and Applied Computing, Lecture Notes in Electrical Engineering, 90, DOI: 10.1007/978-94-007-1192-1_8, © Springer Science+Business Media B.V. 2011

8.1 Introduction

Wavelength division multiplexing technologies have enabled ultra-high-capacity transmission over 1 Tbit/s using Erbium Doped Fiber Amplifiers (EDFAs). To pack a Tbit/s capacity into the gain bandwidth, spectral efficiency has to be improved. Narrow filtering characteristics and a high stability of the center frequency of the optical filters are required to achieve dense WDM systems. Although such narrow optical filters could be developed [1, 2], narrow filtering of the signal light would result in waveform distortion in the received signal. Thus, compact-spectrum signals are also required for reducing the distortion due to narrow filtering. Absolute Polar Duty Cycle Division Multiplexing (AP-DCDM) is an alternative multiplexing technique which is able to support many users per WDM channel [3, 4]. Therefore, as reported in [4], the capacity of the WDM channels can be increased tremendously by using this technique. AP-DCDM enables the use of narrow optical filters, which frees spectral space to increase the channel count. The AP-DCDM system has an intrinsic sensitivity penalty as compared to the binary signal, due to the fragmentation of the main eye into smaller eyes [3]. At the same received power, these small eyes have different qualities and therefore cause different AP-DCDM channels to have different performances, which is not desirable in telecommunication systems [3, 4].

In this paper, a Dual-Drive Mach–Zehnder-Modulator (DD-MZM) is used in an AP-DCDM-WDM setup at 1.28 Tbit/s in order to improve the performance of the AP-DCDM-WDM transmission system. It is shown that, by optimum adjustment of the bias voltage at both ports, the sensitivity of the worst AP-DCDM channel in 1.28 Tbit/s AP-DCDM-WDM over 320 km SSMF can be improved by 4.1 dB. Mach–Zehnder modulators have the important feature that the chirp of the transmitted signal is a function of the electro-optic properties of the p-i-n waveguide, the splitting ratios of the two branch waveguides, the differential length between the two arms of the interferometer, and the format of the modulating voltages applied to the arm electrodes [5–7]. An important property of the DD-MZM is that, due to the quantum confined Stark effect, the attenuation and phase constants of an optical signal propagating in the p-i-n waveguide are nonlinear functions of the applied voltage. Since these constants determine the modulator extinction ratio and chirp, the bias and modulation voltages can be optimized to yield the minimum degradation in receiver sensitivity due to fiber dispersion and self-phase modulation [6, 8].

8.2 Conventional 32-Channel AP-DCDM-WDM Transmission

As shown in Fig. 8.1, the evaluation starts with four AP-DCDM channels (4 × 10 Gbit/s) with a PRBS of $2^{10}-1$ (Fig. 8.1a), followed by 32 WDM channels (32 × 4 × 10 Gbit/s) (Fig. 8.1b). In Fig. 8.1, four OOK channels are multiplexed using AP-DCDM, and the outputs are then multiplexed using the WDM technique (each WDM channel contains 4 × 10 Gbit/s with a PRBS of $2^{10}-1$, as shown in Fig. 8.1a). A channel spacing of 62.5 GHz (0.5 nm) was used. As a result, 128 AP-DCDM channels (32 × 4) are multiplexed in 32 WDM channels (λ1 to λ32) within the ~15.5 nm (1550–1565.5 nm) EDFA band. A WDM spectral efficiency of 0.64 bit/s/Hz was achieved without polarization multiplexing [7]. The transmission line was 4 spans of 80 km Standard Single Mode Fiber (SSMF) followed by a 13.4 km Dispersion Compensation Fiber (DCF). The length ratio between SSMF and DCF is optimized so that the overall second-order dispersion reaches zero. For the SSMF, the simulated specifications for dispersion (D), dispersion slope (S), attenuation coefficient (α), effective area ($A_{eff}$) and nonlinear index of refraction ($n_2$) are 16.75 ps/nm/km, 0.07 ps/nm²/km, 0.2 dB/km, 80 µm² and $2.7 \times 10^{-20}$ m²/W, respectively. For the DCF, D of about -100 ps/nm/km, S of -0.3 ps/nm²/km, α of 0.5 dB/km, $A_{eff}$ of 12 µm² and $n_2$ of $2.6 \times 10^{-20}$ m²/W are used. For the booster and pre-amplifier, an erbium-doped fiber amplifier (EDFA) with a flat gain of 30 dB and a noise figure (NF) of 5 dB was used. The total power to the booster is 8.35 dBm and the launch power into the SSMF is +15 dBm (~0 dBm/channel). The Self Phase Modulation (SPM) effect in the link could be neglected, since the power launched into the SSMF and DCF was less than the SPM threshold.

Fig. 8.1 a 4 × 10 Gbit/s AP-DCDM transmission system. b Simulation setup of 1.28 Tbit/s (32 × 4 × 10 Gbit/s) AP-DCDM-WDM transmissions. c Optical spectrum before transmission. d Optical spectrum after transmission. e Single channel AP-DCDM spectrum
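The headline figures of this setup follow from simple arithmetic on the quoted parameters. The short sketch below recomputes them; all constants are the values given in the text, while the variable names are only illustrative. The DCF length is computed for one 80 km span, which is the reading that reproduces the quoted 13.4 km.

```python
import math

N_WDM = 32              # WDM channels
N_USERS = 4             # AP-DCDM users per WDM channel
USER_RATE_GBPS = 10.0   # bit rate per user
SPACING_GHZ = 62.5      # WDM channel spacing
SPACING_NM = 0.5        # same spacing expressed in wavelength
SPAN_KM = 80.0          # SSMF length per span
D_SSMF = 16.75          # ps/nm/km
D_DCF = -100.0          # ps/nm/km (approximate value quoted in the text)
PER_CHANNEL_DBM = 0.0   # approximate launch power per WDM channel

# Aggregate capacity: 32 x 4 x 10 Gbit/s
total_capacity_tbps = N_WDM * N_USERS * USER_RATE_GBPS / 1000.0

# Spectral efficiency: bit rate per WDM channel over channel spacing
spectral_efficiency = N_USERS * USER_RATE_GBPS / SPACING_GHZ

# Occupied band: 31 spacings between the first and last channel
band_nm = (N_WDM - 1) * SPACING_NM

# DCF length that cancels the accumulated dispersion of one SSMF span
dcf_km = -D_SSMF * SPAN_KM / D_DCF

# Total launch power for 32 channels at the per-channel level
launch_dbm = PER_CHANNEL_DBM + 10.0 * math.log10(N_WDM)

print(f"capacity            = {total_capacity_tbps:.2f} Tbit/s")
print(f"spectral efficiency = {spectral_efficiency:.2f} bit/s/Hz")
print(f"occupied band       = {band_nm:.1f} nm")
print(f"DCF per 80 km span  = {dcf_km:.1f} km")
print(f"total launch power  = {launch_dbm:.2f} dBm")
```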


Figure 8.1c, d show the optical spectra of the 32 WDM channels before and after transmission, respectively. The effect of Four Wave Mixing (FWM) is negligible due to the phase mismatch in the highly dispersive transmission line [9, 10]. Figure 8.2a shows the exemplary eye diagrams taken after the 320 km SSMF (4 spans of 80 km SSMF + 13.4 km DCF) for the worst channel (channel 16) of the WDM system, which contains 4 × 10 Gb/s AP-DCDM. As illustrated in Fig. 8.2 and reported in [7], the eye diagram generated for channel 16, which carries a 4-channel AP-DCDM signal, contains 6 small eyes. Eyes 1, 2, 3 and 4 (slots 1 and 2) correspond to the performance of AP-DCDM channel 1, eyes 2, 4 and 5 (slots 2 and 3) are related to the performance of AP-DCDM channel 2, eyes 5 and 6 (slots 3 and 4) influence the performance of AP-DCDM channel 3, and eye 6 (slot 4) is related to AP-DCDM channel 4. As illustrated in Fig. 8.2a, at -25 dBm received power the Q-factor of all four eyes located at the first level is more than 6, which is higher than that of the eyes located at the second level (around 3.6 and 3.8 for eyes 1 and 2, respectively). The eye openings at the different levels are almost similar, but they have different Q-factors due to the different standard deviation of the noise at each level. Therefore, at the same received power, the channel with the minimum noise variation has the best performance (e.g. channel 4) and the channel with the maximum variation has the worst performance (channel 1).
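To give a feeling for what these Q-factors mean in terms of error rate, the sketch below applies the standard Gaussian-noise estimate BER = 0.5·erfc(Q/√2). This relation is an assumption brought in for illustration; the chapter itself does not state which BER estimator the simulation uses, and the function name `ber_from_q` is hypothetical.

```python
import math


def ber_from_q(q):
    """Gaussian-noise estimate of the bit error rate for a given Q-factor."""
    return 0.5 * math.erfc(q / math.sqrt(2.0))


# Q-factors quoted in the text for the conventional setup, plus Q = 6
cases = [
    ("eye 1 (second level)", 3.6),
    ("eye 2 (second level)", 3.8),
    ("first-level threshold", 6.0),
]
for label, q in cases:
    print(f"{label}: Q = {q:.1f} -> BER ~ {ber_from_q(q):.2e}")
```

Under this estimate, Q = 6 corresponds to a BER close to 10^-9, which is why the second-level eyes with Q below 4 dominate the error rate of the affected channels.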

Fig. 8.2 a Received eye diagram for channel 16 in 32-channel AP-DCDM-WDM system. b Received eye diagram for channel 16 in 32-channel AP-DCDM-WDM system using optimized DD-MZM


8.3 Dual Drive-Mach–Zehnder Modulator

The Dual Drive-Mach–Zehnder modulator consists of an input Y-branch (splitter), two arms with independent drive electrodes, and an output Y-branch (combiner). The optical signal incident on the input Y-branch is split into the two arms of the interferometer. When the signals recombine at the output Y-branch, the on-state is achieved when there is no differential phase shift between the two signals, and the off-state is achieved when there is a differential phase shift of π radians. The total optical field at the output of the Y-branch combiner is, to a good approximation, the sum of the fields at the outputs of the two arms. If the splitting ratios of the input and output Y-branches are identical, the output of the modulator is given by [6]

$$E(V_1, V_2) = \frac{E_0}{1 + SR} \left\{ SR \exp\left[ -\left( \frac{\Delta\alpha_a(V_1)}{2} + j\,\Delta\beta(V_1) \right) L \right] + \exp\left[ -\left( \frac{\Delta\alpha_a(V_2)}{2} + j\,\Delta\beta(V_2) \right) L - j\,\Phi_0 \right] \right\} = \sqrt{I(V_1, V_2)}\, \exp\left( j\,\varphi_0(V_1, V_2) \right)$$

where $SR = P_1/P_2$ is the Y-branch power splitting ratio; $\Delta\alpha_a/2$ is the attenuation constant; $\Delta\beta$ is the phase constant; $\Phi_0$ is 0 radians for a conventional modulator and π radians for a π phase shift modulator; $V_1$ and $V_2$ are the voltages applied to arms 1 and 2, respectively; $I$ is the intensity of the optical signal; and $\varphi_0$ is its phase. For $i = 1, 2$,

$$V_i(t) = V_{bi} + V_{\mathrm{mod}\,i}\, v(t)$$

where $V_{bi}$ is the bias voltage, $V_{\mathrm{mod}\,i}$ is the peak-to-peak modulation voltage, and $v(t)$ is the modulation waveform with a peak-to-peak amplitude of one and an average value of zero. The dependence of the attenuation and phase constants on the applied voltage can be obtained either by direct measurement of a straight section of waveguide cut from one arm of a modulator [5] or by using measurements of the voltage dependence of the intensity of the output signal for each arm with the other arm strongly absorbing [6–8].

Referring to Sect. 8.2, an improvement in the system performance can be obtained by having an optimum amplitude distribution among the AP-DCDM signal levels. This can be achieved by optimizing the amplitude control of the levels. To satisfy that requirement, we implement the DD-MZM, which consists of an input Y-branch splitter, two arms with independent drive electrodes, and an output Y-branch combiner, in our setup as a replacement for the conventional single-drive amplitude modulator (AM).
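The field expression above can be evaluated directly once the attenuation and phase constants are known. The sketch below is a minimal illustration with hypothetical, idealized inputs (zero attenuation, a normalized length, and phase constants chosen by hand); the measured voltage dependences of the real device are not given in the text, so the function `mzm_output_field` takes the already-evaluated constants as arguments. It shows the on-state, the off-state, and the finite extinction that results from an unbalanced splitting ratio such as SR = 1.3.

```python
import cmath
import math


def mzm_output_field(e0, sr, alpha1, beta1, alpha2, beta2, length, phi0=0.0):
    """Output field of the dual-drive modulator.

    alpha_i and beta_i are the attenuation and phase constants of arm i,
    already evaluated at the applied voltage V_i (hypothetical inputs).
    """
    arm1 = sr * cmath.exp(-(alpha1 / 2.0 + 1j * beta1) * length)
    arm2 = cmath.exp(-(alpha2 / 2.0 + 1j * beta2) * length - 1j * phi0)
    return e0 / (1.0 + sr) * (arm1 + arm2)


LENGTH = 1.0  # normalized interaction length (assumption)

# Balanced splitter (SR = 1), lossless arms
on_balanced = abs(mzm_output_field(1.0, 1.0, 0.0, 0.0, 0.0, 0.0, LENGTH))
off_balanced = abs(mzm_output_field(1.0, 1.0, 0.0, 0.0, 0.0, math.pi, LENGTH))

# Splitting ratio used later in the chapter (SR = 1.3), lossless arms
on_sr13 = abs(mzm_output_field(1.0, 1.3, 0.0, 0.0, 0.0, 0.0, LENGTH))
off_sr13 = abs(mzm_output_field(1.0, 1.3, 0.0, 0.0, 0.0, math.pi, LENGTH))
extinction_db = 20.0 * math.log10(on_sr13 / off_sr13)

print(f"SR = 1.0: |E_on| = {on_balanced:.3f}, |E_off| = {off_balanced:.3e}")
print(f"SR = 1.3: |E_on| = {on_sr13:.3f}, |E_off| = {off_sr13:.3f}")
print(f"SR = 1.3 field extinction = {extinction_db:.2f} dB")
```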

Table 8.1 DD-MZM optimization process for (a) Vb2, (b) Vb1

Setup                     Q1    Q2    Q3    Q4    Q5    Q6
(a)
Conventional AP-DCDM      3.6   3.8   6.7   6.7   6.7   6.4
MZM, Vb2 = -1 V           4.4   4.7   6.3   6.5   6.4   6.1
MZM, Vb2 = -0.8 V         5.1   5.4   6.2   6.4   6.2   6.1
MZM, Vb2 = -0.6 V         5.9   6.1   6.1   6.3   6.2   6
MZM, Vb2 = -0.4 V         6.4   5.9   6.9   6.1   6     5.8
(b)
MZM, Vb1 = -3 V           6.5   5.5   7     6.5   6     5.5
MZM, Vb1 = -2.9 V         5.9   6.1   6.1   6.3   6.2   6
MZM, Vb1 = -2.8 V         5.5   5.8   6.5   6.3   6.2   6
MZM, Vb1 = -2.6 V         4.5   5     6.5   6.5   6.2   6

8.4 Optimizing the DD-MZM for 1.28 Tbit/s AP-DCDM-WDM Transmission

As discussed in Sect. 8.2, almost similar Q-factors are needed for all 6 eyes to achieve similar performance for all channels. This can be done by improving the eye quality at the second level. In order to change the eye height at the second level while maintaining the maximum power, the bias voltage 1 (Vb1) and bias voltage 2 (Vb2) of the DD-MZM need to be optimized so that the eye height at the first level is reduced while the eye height at the second level is increased. The optimum bias voltages are considered for two different conditions for the worst channel in the 32-channel AP-DCDM-WDM system (Channel 16), as shown in Table 8.1. The dependence of the Q-factor of all 6 eyes on Vb2 is shown in Table 8.1a at a fixed received power of -25 dBm (the receiver sensitivity of the best channel), a fixed Vb1 of -2.9 V and a splitting ratio of 1.3. It can be seen from Table 8.1a that the optimum Vb2 is around -0.6 V, where eye 1 to eye 6 have almost similar Q-factors of 5.9, 6.1, 6.1, 6.3, 6.2 and 6, respectively. The variation of the Q-factor for different values of Vb1 with a fixed Vb2 of -0.6 V is shown in Table 8.1b. As illustrated in Table 8.1b, the optimum Vb1 is around -2.9 V, where all eyes have similar Q-factors. Referring to Table 8.1, under optimized bias voltage conditions the variation in the Q-factor is quite small, and it is expected that the optimum sensitivity is essentially similar for all multiplexed channels [11].
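The selection of the optimum bias values can be reproduced from the Q-factors in Table 8.1. The sketch below uses a max-min criterion (pick the bias whose worst eye has the highest Q), together with the spread between the best and worst eye; this criterion is an assumption chosen for illustration, since the text only states that the Q-factors should be almost similar. The dictionary and function names are hypothetical.

```python
# Table 8.1a: Vb2 in volts -> (Q1, ..., Q6), at fixed Vb1
Q_VS_VB2 = {
    -1.0: (4.4, 4.7, 6.3, 6.5, 6.4, 6.1),
    -0.8: (5.1, 5.4, 6.2, 6.4, 6.2, 6.1),
    -0.6: (5.9, 6.1, 6.1, 6.3, 6.2, 6.0),
    -0.4: (6.4, 5.9, 6.9, 6.1, 6.0, 5.8),
}

# Table 8.1b: Vb1 in volts -> (Q1, ..., Q6), at fixed Vb2
Q_VS_VB1 = {
    -3.0: (6.5, 5.5, 7.0, 6.5, 6.0, 5.5),
    -2.9: (5.9, 6.1, 6.1, 6.3, 6.2, 6.0),
    -2.8: (5.5, 5.8, 6.5, 6.3, 6.2, 6.0),
    -2.6: (4.5, 5.0, 6.5, 6.5, 6.2, 6.0),
}


def best_bias(table):
    """Return the bias whose worst eye has the highest Q-factor."""
    return max(table, key=lambda bias: min(table[bias]))


def q_spread(q_values):
    """Difference between the best and the worst eye."""
    return max(q_values) - min(q_values)


best_vb2 = best_bias(Q_VS_VB2)
best_vb1 = best_bias(Q_VS_VB1)

print(f"best Vb2 = {best_vb2} V, spread = {q_spread(Q_VS_VB2[best_vb2]):.1f}")
print(f"best Vb1 = {best_vb1} V, spread = {q_spread(Q_VS_VB1[best_vb1]):.1f}")
```

Both searches return the values reported in the text (Vb2 of -0.6 V and Vb1 of -2.9 V), and the two optimum rows of the table are identical, as expected for the same operating point.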

8.5 32 Channels AP-DCDM-WDM System Performance Using Optimized DD-MZM

The simulation results are obtained by replacing the AM in Fig. 8.1 by the optimized DD-MZM for all 32 channels. The optimized DD-MZM was fixed with a splitting ratio (SR) of 1.3, Vb1 of -2.9 V and Vb2 of -0.6 V. Figure 8.2b shows the exemplary eye diagrams taken after the 320 km SSMF (4 spans of 80 km SSMF + 13.4 km DCF) for Channel 16. As illustrated, although the eye heights are different, the Q-factors are almost the same. Compared to AP-DCDM with AM, the Q-factors related to the second level are greatly improved (from 3.6 and 3.8 to 5.9 and 6.1 for eyes 1 and 2, respectively). Note that the maximum amplitude values for the AM and DD-MZM eye diagrams are the same. By improving the quality of the second-level eyes, the performance of the worst users (users 1 and 2) in the middle WDM channel (Ch. 16) is significantly improved. In addition, almost the same performance is obtained for all channels.

Figure 8.3 compares the receiver sensitivity of AP-DCDM-WDM with AM and with the optimized DD-MZM for all 32 channels. The degradation of the receiver sensitivity is caused by the accumulated spontaneous emission light from each LD through the multiplexing process and by the noise figure (NF) of the pre-amplifier. As shown in Fig. 8.3, the receiver sensitivity was around -21 dBm for the conventional AP-DCDM-WDM system, and the variation between the channels was around 1.5 dB. The receiver sensitivity of the proposed AP-DCDM-WDM system was improved to around -25.1 dBm. Therefore, the proposed solution improves the receiver sensitivity by around 4.1 dB. Figure 8.4 shows the improvement in OSNR for the proposed AP-DCDM-WDM system compared to the conventional AP-DCDM-WDM system at a BER of $10^{-9}$. The reason for this receiver sensitivity and OSNR improvement can be seen by comparing the received eye diagrams depicted in Fig. 8.2a, b.

Fig. 8.3 Pre-amplified receiver sensitivity versus signal wavelength for 32 channels

Fig. 8.4 OSNR versus signal wavelength for 32 channels

8.6 Conclusion

We have presented the performance of the 1.28 Tbit/s AP-DCDM over WDM technique when the drive voltages of the DD-MZM are optimized. In comparison to the previous report [7], a considerable receiver sensitivity improvement (4.1 dB) was achieved. The improvement is due to the increase in eye height at the second level, which leads to a Q-factor enhancement. These results are useful in the search for the optimum AP-DCDM transmission system.

References

1. Kim H, Essiambre R-J (2003) Transmission of 8 × 20 Gb/s DQPSK signals over 310-km SMF with 0.8-b/s/Hz spectral efficiency. IEEE Photon Technol Lett 15(5):769–771
2. Winzer P, Essiambre R (2006) Advanced modulation formats for high-capacity optical transport networks. J Lightw Technol 24:4711–4728
3. Malekmohammadi A, Abdullah MK, Abas AF, Mahdiraji GA, Mokhtar M (2009) Analysis of RZ-OOK over absolute polar duty cycle division multiplexing in dispersive transmission medium. IET Optoelectron 3(4):197–206
4. Malekmohammadi A, Abas AF, Abdullah MK, Mahdiraji GA, Mokhtar M (2009) Realization of high capacity transmission in fiber optic communication systems using absolute polar duty cycle division multiplexing (AP-DCDM) technique. Opt Fiber Technol 15(4):337–343
5. Cartledge C (1999) Optimizing the bias and modulation voltages of MQW Mach–Zehnder modulators for 10 Gb/s transmission on nondispersion shifted fiber. J Light Tech 17:1142–1151
6. Adams DM, Rolland C, Fekecs A, McGhan D, Somani A, Bradshaw S, Poirier M, Dupont E, Cremer E, Anderson K (1998) 1.55 µm transmission at 2.5 Gbit/s over 1102 km of NDSF using discrete and monolithically integrated InGaAsP/InP Mach–Zehnder modulator and DFB laser. Electron Lett 34:771–773
7. Malekmohammadi A, Abas AF, Abdullah MK, Mahdiraji GA, Mokhtar M, Rasid M (2009) AP-DCDM over WDM system. Opt Commun 282:4233–4241
8. Hoshida T, Vassilieva O, Yamada K, Choudhary S, Pecqueur R, Kuwahara H (2002) Optimal 40 Gb/s modulation formats for spectrally efficient long-haul DWDM system. IEEE J Lightw Tech 20(12):1989–1996
9. Winzer PJ, Chandrasekhar S, Kim H (2003) Impact of filtering on RZ-DPSK reception. IEEE Photon Technol Lett 15(6):840–842
10. Suzuki S, Kawano Y, Nakasha Y (2005) A novel 50-Gbit/s NRZ-RZ converter with retiming function using InP-HEMT technology. In: Presented at the Compound Semiconductor Integrated Circuit Symposium
11. Malekmohammadi A, Abdullah MK, Abas AF (2010) Performance enhancement of AP-DCDM over WDM with dual drive Mach–Zehnder-Modulator in 1.28 Tbit/s optical fiber communication systems. Lecture notes in engineering and computer science: Proceedings of the World Congress on Engineering 2010, WCE 2010, 30 June–2 July, London, UK, pp 948–951

Chapter 9

Wi-Fi WEP Point-to-Point Links: Performance Studies of IEEE 802.11a, b, g Laboratory Links

J. A. R. Pacheco de Carvalho, H. Veiga, N. Marques, C. F. Ribeiro Pacheco and A. D. Reis

Abstract The importance of wireless communications has been growing. Performance is a crucial issue, resulting in more reliable and efficient communications. Security is equally important. Laboratory measurements are made about several performance aspects of Wi-Fi (IEEE 802.11a, b, g) WEP point-to-point links. A contribution is given to performance evaluation of this technology, using two types of access points from Enterasys Networks (RBT-4102 and RBTR2). Detailed results are presented and discussed, namely at OSI levels 4 and 7, from TCP, UDP and FTP experiments: TCP throughput, jitter, percentage datagram loss and FTP transfer rate. Comparisons are made to corresponding results obtained for open links. Conclusions are drawn about the comparative performance of the links.

J. A. R. P. de Carvalho, C. F. Ribeiro Pacheco, A. D. Reis: Unidade de Detecção Remota, Universidade da Beira Interior, 6201-001 Covilhã, Portugal

H. Veiga, N. Marques: Centro de Informática, Universidade da Beira Interior, 6201-001 Covilhã, Portugal

S. I. Ao and L. Gelman (eds.), Electrical Engineering and Applied Computing, Lecture Notes in Electrical Engineering, 90, DOI: 10.1007/978-94-007-1192-1_9, © Springer Science+Business Media B.V. 2011

9.1 Introduction Wireless communications are increasingly important for their versatility, mobility, speed and favourable prices. It is the case of microwave and laser based technologies, e.g. Wi-Fi (Wireless Fidelity) and FSO (Free Space Optics), respectively. The importance and utilization of Wi-Fi have been growing for complementing traditional wired networks. Wi-Fi has been used both in ad hoc mode, for communications in temporary situations e.g. meetings and conferences, and infrastructure mode. In this case, an AP (Access Point) is used to permit communications of Wi-Fi devices with a wired based LAN (Local Area Network) through a switch/router. In this way a WLAN, based on the AP, is formed which is known as a cell. A WPAN (Wireless Personal Area Network) arises in relation to a PAN (Personal Area Network). Point-to-point and point-to-multipoint configurations are used both indoors and outdoors, requiring specialized directional and omnidirectional antennas. Wi-Fi uses microwaves in the 2.4 and 5 GHz frequency bands and IEEE 802.11a, 802.11b and 802.11g standards [1]. Due to increasing used of 2.4 GHz band, interferences increase. Then, the 5 GHz band has received considerable interest, although absorption increases and ranges are shorter. Nominal transfer rates up to 11 (802.11b) and 54 Mbps (802.11a, g) are specified. CSMA/CA is the medium access control. Wireless communications, wave propagation [2, 3] and WLAN practical implementations [4] have been studied. Detailed information is available about the 802.11 architecture, including performance analysis of the effective transfer rate, where an optimum factor of 0.42 was presented for 11 Mbps point-to-point links [5]. Wi-Fi (802.11b) performance measurements are available for crowded indoor environments [6]. Performance has been a very important issue, giving more reliable and efficient communications. 
New telematic applications are especially sensitive to performance, when compared to traditional applications. Application characterization and requirements have been discussed e.g. for voice, Hi Fi audio, video on demand, moving images, HDTV images, virtual reality, interactive data, static images, intensive data, supercomputation, electronic mail, and file transfer [7]. For example, requirements have been presented for video on demand/moving images (1–10 ms jitter and 1–10 Mbps throughput) and for Hi Fi stereo audio (jitter less than 1 ms and 0.1–1 Mbps throughput). Wi-Fi microwave radio signals can be easily captured by anyone. WEP (Wired Equivalent Privacy) was initially intended to provide confidentiality comparable to that of a traditional wired network. In spite of its weaknesses, WEP is still widely used in Wi-Fi communications for security reasons. In WEP, the communicating devices use the same shared key to encrypt and decrypt the data carried by the radio signals. Several performance measurements have been made for 2.4 and 5 GHz Wi-Fi open links [8–10], as well as very high speed FSO [11]. In the present work further Wi-Fi (IEEE 802.11a, b, g) results are presented, using WEP, through OSI levels 4 and 7.


Performance is evaluated in laboratory measurements of WEP point-to-point links using available equipment. Comparisons are made to corresponding results obtained for open links. Conclusions are drawn about the comparative performance of the links. The rest of the paper is structured as follows: Sect. 9.2 presents the experimental details, i.e. the measurement setup and procedure. Results and discussion are presented in Sect. 9.3. Conclusions are drawn in Sect. 9.4.

9.2 Experimental Details Two types of experiments were carried out, which are referred to as Exp-A and Exp-B. In the measurements of Exp-A we used Enterasys RoamAbout RBT-4102 level 2/3/4 access points (referred to as AP-A), equipped with 16–20 dBm IEEE 802.11a/b/g transceivers and internal dual-band diversity antennas [12], and 100-Base-TX/10-Base-T Allied Telesis AT-8000S/16 level 2 switches [13]. The access points had transceivers based on the Atheros 5213A chipset, and firmware version 1.1.51. They were parameterized and monitored through both the console, using CLI (Command Line Interface), and an embedded HTTPS (Secure HTTP) server. The configuration was for minimum transmitted power and equivalent to point-to-point, LAN to LAN mode, using the internal antenna. For the measurements of Exp-B we used Enterasys RoamAbout RBTR2 level 2/3/4 access points (referred to as AP-B), equipped with 15 dBm IEEE 802.11a/b/g cards [12], and 100-Base-TX/10-Base-T Allied Telesis AT-8000S/16 level 2 switches [13]. The access points had RBTBH-R2 W radio cards similar to the Agere-Systems model 0118 type, and firmware version 6.08.03. They were parameterized and monitored through both the console and the RoamAbout AP Manager software. The configuration was for minimum transmitted power i.e. micro cell, point-to-point, LAN to LAN mode, using the antenna built into the card. Interference-free channels were used in the communications. This was checked through a portable computer, equipped with a Wi-Fi 802.11a/b/g adapter, running NetStumbler software [14]. WEP encryption was activated, using 128 bit encryption and a shared key for data encryption composed of 13 ASCII characters. No power levels above the minimum were required, as the access points were very close. Both types of experiments, Exp-A and Exp-B, were made using a laboratory setup, which was planned and implemented as shown in Fig. 9.1.
Fig. 9.1 Experimental laboratory setup scheme

At OSI level 4 measurements were made for TCP connections and UDP communications, using Iperf software [15], permitting network performance results to be recorded. For a TCP connection, TCP throughput was obtained. For a UDP communication with a given bandwidth parameter, UDP throughput, jitter and percentage loss of datagrams were obtained. TCP packets and UDP datagrams of 1470 bytes size were used. A window size of 8 kbytes and a buffer size of the same value were used for TCP and UDP, respectively. In Fig. 9.1, one PC having IP 192.168.0.2 was the Iperf server and the other, with IP 192.168.0.6, was the Iperf client. Jitter, which represents the smoothed mean of differences between consecutive transit times, was continuously computed by the server, as specified by RTP (Real-time Transport Protocol) in RFC 1889 [16]. The same scheme was used for FTP measurements, where FTP server and client applications were installed in the PCs with IPs 192.168.0.2 and 192.168.0.6, respectively. The PCs were portable computers running Windows XP. They were set up to make maximum resources available to the present work. Also, batch command files were written to enable the TCP, UDP and FTP tests. The results were obtained in batch mode and written as data files to the client PC disk. Each PC had a second network adapter, to permit remote control from the official IP Remote Detection Unit network, via switch.
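The jitter definition referred to above can be illustrated with a short sketch. It follows the running estimator described in RFC 1889, in which each new transit-time difference moves the estimate by one sixteenth of the gap. The function name and the sample timestamps are hypothetical and are not taken from the measurements reported here.

```python
def rfc1889_jitter(send_times, recv_times):
    """Running interarrival jitter estimate in the style of RFC 1889.

    send_times and recv_times are per-datagram timestamps in seconds.
    The estimate is updated as J += (|D| - J) / 16, where D is the
    difference between consecutive transit times.
    """
    if len(send_times) != len(recv_times):
        raise ValueError("send_times and recv_times must have equal length")
    jitter = 0.0
    previous_transit = None
    for sent, received in zip(send_times, recv_times):
        transit = received - sent
        if previous_transit is not None:
            difference = abs(transit - previous_transit)
            jitter += (difference - jitter) / 16.0
        previous_transit = transit
    return jitter


# Hypothetical timestamps (seconds): transit times of 10, 12 and 10 ms.
example = rfc1889_jitter([0.0, 1.0, 2.0], [0.010, 1.012, 2.010])
print(f"jitter estimate: {example * 1000:.4f} ms")
```

Because of the 1/16 gain, a single delayed datagram changes the estimate only slightly, which is why the quantity behaves as a smoothed mean rather than as a peak value.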

9.3 Results and Discussion Both access points AP-A and AP-B were configured with various fixed transfer rates for every one of the standards IEEE 802.11b (1, 2, 5.5 and 11 Mbps), 802.11g and 802.11a (6, 9, 12, 18, 24, 36, 48 and 54 Mbps). At OSI level 1 (physical layer), for every one of the cases, the local and remote values of the signal to noise ratios SNR were recorded. The best SNR levels were observed for 802.11g and 802.11a. Performance measurements, using TCP connections and UDP communications at OSI level 4 (transport layer), were carried out for both Exp-A and Exp-B. In each experiment, for every standard and nominal fixed transfer rate, an average TCP throughput was determined from several experiments. This value was used as the bandwidth parameter for every corresponding UDP test, giving average jitter and average percentage datagram loss. The results are shown in Figs. 9.2, 9.3 and 9.4.
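The procedure of averaging several TCP results and reusing the average as the UDP bandwidth parameter can be sketched as follows. This is a minimal illustration, assuming Iperf 2 style command-line options (`-c`, `-u`, `-b`, `-l`, `-w`); the helper name and the sample throughputs are hypothetical, and the batch files actually used in the experiments are not reproduced in the text.

```python
from statistics import mean


def udp_test_command(server_ip, tcp_runs_mbps, datagram_bytes=1470, buffer_kbytes=8):
    """Build an Iperf 2 style UDP client command from repeated TCP results.

    The mean TCP throughput (Mbps) becomes the UDP bandwidth parameter.
    """
    if not tcp_runs_mbps:
        raise ValueError("at least one TCP throughput value is required")
    bandwidth_mbps = mean(tcp_runs_mbps)
    return [
        "iperf", "-c", server_ip,         # client mode, server address
        "-u",                             # UDP instead of TCP
        "-b", f"{bandwidth_mbps:.2f}M",   # target bandwidth
        "-l", str(datagram_bytes),        # datagram size in bytes
        "-w", f"{buffer_kbytes}K",        # socket buffer size
    ]


# Hypothetical TCP throughputs (Mbps) from repeated runs at one nominal rate.
command = udp_test_command("192.168.0.2", [13.0, 13.1, 13.2])
print(" ".join(command))
```

Returning the command as a list of arguments keeps the sketch independent of any shell quoting rules; it could be passed to a process launcher or written into a batch file.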


Fig. 9.2 TCP throughput results versus technology and nominal transfer rate; Exp-A and Exp-B

Figure 9.2 shows the results from Exp-A and Exp-B, where polynomial fits were made to the TCP throughput data for each AP implementation of IEEE 802.11a, b, g, and R² is the coefficient of determination. It follows that the best TCP throughputs are, in descending order, for 802.11a, 802.11g and 802.11b. In Exp-A (Fig. 9.2), the data for 802.11a are on average 32.6% higher than for 802.11g. The average values are 13.10 ± 0.39 Mbps for 802.11a, and 9.62 ± 0.29 Mbps for 802.11g. These values are in good agreement with those obtained for the same AP type and open links (13.19 ± 0.40 Mbps and 9.97 ± 0.30 Mbps for 802.11a and 802.11g, respectively) [9]. For 802.11b, the average value is 2.55 ± 0.08 Mbps. Also, the 802.11b data for 5.5 and 11 Mbps (average 4.05 ± 0.12 Mbps) are in good agreement with those obtained for the same AP type and open links (4.08 ± 0.12 Mbps) [9]. In Exp-B (Fig. 9.2), the data for 802.11a are on average 2.9% higher than for IEEE 802.11g. The average values are 12.97 ± 0.39 Mbps for 802.11a, and 12.61 ± 0.38 Mbps for 802.11g. These values are in good agreement with those obtained for the same AP type and open links (12.92 ± 0.39 Mbps and 12.60 ± 0.38 Mbps for 802.11a and 802.11g, respectively) [9]. For 802.11b, the average value is 2.42 ± 0.07 Mbps.


Fig. 9.3 UDP—jitter results versus technology and nominal transfer rate; Exp-A and Exp-B

Also, the 802.11b data for 5.5 and 11 Mbps (average 3.88 ± 0.12 Mbps) are in good agreement with those obtained for the same AP type and open links (3.84 ± 0.12 Mbps) [9]. The best TCP throughput performance was for AP-B. For both Exp-A and Exp-B, in Figs. 9.3 and 9.4 the data points representing jitter and percentage datagram loss, respectively, were joined by smoothed lines. In Exp-A (Fig. 9.3) the jitter data are on average lower for 802.11a (1.9 ± 0.1 ms) than for 802.11g (2.6 ± 0.1 ms). Similar trends were observed for the same AP type and open links (1.3 ± 0.1 ms for 802.11a and 2.4 ± 0.1 ms for 802.11g) [9]. For 802.11b, the average value is 4.8 ± 0.3 ms. Also, the 802.11b data for 5.5 and 11 Mbps (average 5.6 ± 0.9 ms) are higher than those for the same AP type and open links (3.7 ± 0.5 ms) [9]. In Exp-B (Fig. 9.3), the jitter data (1.8 ± 0.1 ms on average) show fair agreement for IEEE 802.11a and 802.11g. Similar trends were observed for the same AP type and open links (1.9 ± 0.1 ms on average) [9]. For 802.11b the average value is 1.6 ± 0.1 ms. Also, the 802.11b data for 5.5 and 11 Mbps (average 2.5 ± 0.5 ms) are in good agreement with those for the same AP type and open links (2.6 ± 0.2 ms) [9]. The best jitter performance was for AP-B.


Fig. 9.4 UDP—percentage datagram loss results versus technology and nominal transfer rate; Exp-A and Exp-B

In both Exp-A and Exp-B (Fig. 9.4), generally, the percentage datagram loss data agree rather well for all standards. They are on average 1.2 ± 0.1%. This is in good agreement with the results for the same AP types and open links (on average, 1.3 ± 0.2% for AP-A and 1.2 ± 0.2% for AP-B) [9]. AP-A and AP-B have shown similar percentage datagram loss performances. At OSI level 7 (application layer), FTP transfer rates were measured versus the nominal transfer rates configured in the APs for the IEEE 802.11a, b, g standards. Each measurement was the average rate for a single FTP transfer, using a binary file of 100 Mbytes. The results from Exp-A and Exp-B are represented in Fig. 9.5. Polynomial fits to the data were made for the implementation of every standard. It was found that in both cases the best performances were, in descending order, for 802.11a, 802.11g and 802.11b: the same trends found for TCP throughput. The FTP transfer rates obtained in Exp-A, using IEEE 802.11b, were close to those in Exp-B. The FTP performances obtained for Exp-A and IEEE 802.11a were only slightly better than in Exp-B. In contrast, for Exp-A and IEEE 802.11g, FTP performances were significantly worse than in Exp-B, suggesting that AP-B had a better FTP performance than AP-A for IEEE 802.11g. Similar trends had been observed for corresponding open links [9].


Fig. 9.5 FTP transfer rate results versus technology and nominal transfer rate; Exp-A and Exp-B

Generally, the results measured for the WEP links agree reasonably well, within the experimental errors, with corresponding data obtained for open links.
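The statement that WEP and open-link results agree within the experimental errors can be checked numerically for the TCP throughput averages quoted above. The sketch below is only one possible reading of "agree within errors": it assumes two values agree when their difference does not exceed the quadrature sum of their uncertainties. That criterion, and the names used, are assumptions and are not stated in the text.

```python
import math


def agree_within_errors(value_a, error_a, value_b, error_b):
    """True if two measurements differ by no more than their combined error.

    The combined error is taken as the quadrature sum of both uncertainties
    (an assumed criterion, not one given in the chapter).
    """
    return abs(value_a - value_b) <= math.hypot(error_a, error_b)


# TCP throughput averages in Mbps quoted in the text: (WEP link, open link).
TCP_MBPS = {
    ("Exp-A", "802.11a"): ((13.10, 0.39), (13.19, 0.40)),
    ("Exp-A", "802.11g"): ((9.62, 0.29), (9.97, 0.30)),
    ("Exp-B", "802.11a"): ((12.97, 0.39), (12.92, 0.39)),
    ("Exp-B", "802.11g"): ((12.61, 0.38), (12.60, 0.38)),
}

agreement = {
    key: agree_within_errors(wep[0], wep[1], open_link[0], open_link[1])
    for key, (wep, open_link) in TCP_MBPS.items()
}

for (experiment, standard), ok in agreement.items():
    print(experiment, standard, "agrees" if ok else "differs")
```

Under this criterion all four throughput pairs agree. The same function can be applied to the jitter averages, where some pairs differ by more than the combined uncertainty.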

9.4 Conclusions In the present work a laboratory setup arrangement was planned and implemented, that permitted systematic performance measurements of available access point equipment (RBT-4102 and RBTR2 from Enterasys) for Wi-Fi (IEEE 802.11a, b, g) in WEP point-to-point links. Through OSI layer 4, TCP throughput, jitter and percentage datagram loss were measured and compared for each standard. The best TCP throughputs were found, in descending order, for 802.11a, 802.11g and 802.11b. TCP throughputs were also found sensitive to AP type. Similar trends were observed for the same AP types and open links. The lower average jitter values were found for IEEE 802.11a and 802.11g. Some sensitivity to AP type was observed. For the percentage datagram loss, a reasonably good agreement was found, on average, for all standards and AP types. Similar trends were observed for the same AP types and open links. At OSI layer 7, the measurements of FTP transfer rates have shown that the best FTP performances were, in descending order, for 802.11a, 802.11g and 802.11b. This result shows the same trends found for TCP throughput. Similar trends were observed for the same AP types and open links. FTP performances were also found sensitive to AP type. Generally, the results measured for WEP links agree reasonably well, within the experimental errors, with corresponding data obtained for open links. Additional performance measurements have either started or are planned, using several types of equipment, not only in the laboratory but also in outdoor environments involving, mainly, medium range links.

Acknowledgments Support from University of Beira Interior and FCT (Fundação para a Ciência e a Tecnologia)/POCI2010 (Programa Operacional Ciência e Inovação) is acknowledged. We acknowledge Enterasys Networks for their availability.

References
1. IEEE Std 802.11-2007, IEEE standard for local and metropolitan area networks-specific requirements-part 11: wireless LAN medium access control (MAC) and physical layer (PHY) specifications (10 October 2007). http://standards.ieee.org/getieee802
2. Mark JW, Zhuang W (2003) Wireless communications and networking. Prentice-Hall Inc, Upper Saddle River
3. Rappaport TS (2002) Wireless communications principles and practice, 2nd edn. Prentice-Hall Inc, Upper Saddle River
4. Bruce WR III, Gilster R (2002) Wireless LANs end to end. Hungry Minds Inc, New York
5. Schwartz M (2005) Mobile wireless communications. Cambridge University Press, Cambridge
6. Sarkar NI, Sowerby KW (2006) High performance measurements in the crowded office environment: a case study. In: Proceedings ICCT’06-International Conference on Communication Technology, Guilin, China, 27–30 November, pp 1–4
7. Monteiro E, Boavida F (2002) Engineering of informatics networks, 4th edn. FCA-Editor of Informatics Ld, Lisbon
8. Pacheco de Carvalho JAR, Gomes PAJ, Veiga H, Reis AD (2008) Development of a university networking project. In: Putnik GD, Manuela Cunha M (eds) Encyclopedia of Networked and Virtual Organizations. IGI Global, Hershey, pp 409–422
9. Pacheco de Carvalho JAR, Veiga H, Gomes PAJ, Ribeiro Pacheco CF, Marques N, Reis AD (2010) Wi-Fi point-to-point links: performance aspects of IEEE 802.11a, b, g laboratory links. In: Ao S-I, Gelman L (eds) Electronic Engineering and Computing Technology, Series: Lecture Notes in Electrical Engineering, vol 60. Springer, Netherlands, pp 507–514
10. Pacheco de Carvalho JAR, Veiga H, Marques N, Ribeiro Pacheco CF, Reis AD (2010) Laboratory performance of Wi-Fi WEP point-to-point links: a case study. Lecture notes in engineering and computer science: Proceedings of the World Congress on Engineering 2010, WCE 2010, vol I. 30 June–2 July, London, UK, pp 764–767
11. Pacheco de Carvalho JAR, Veiga H, Gomes PAJ, Cláudia F, Ribeiro Pacheco FP, Reis AD (2008) Experimental performance study of very high speed free space optics link at the University of Beira Interior campus: a case study. In: Proceedings ISSPIT 2008-8th IEEE International Symposium on Signal Processing and Information Technology, Sarajevo, Bosnia and Herzegovina, December 16–19, pp 154–157
12. Enterasys Networks, Roam About R2, RBT-4102 Wireless Access Points (20 December 2008). http://www.enterasys.com
13. Allied Telesis, AT-8000S/16 Layer 2 Managed Fast Ethernet Switch (20 December 2008). http://www.alliedtelesis.com
14. NetStumbler software. http://www.netstumbler.com
15. Iperf software, NLANR. http://dast.nlanr.net
16. Network Working Group, RFC 1889-RTP: A Transport Protocol for Real Time Applications. http://www.rfc-archive.org

Chapter 10

Interaction Between the Mobile Phone and Human Head of Various Sizes Adel Zein El Dein Mohammed Moussa and Aladdein Amro

Abstract This chapter analyzes the specific absorption rate (SAR) induced in human head models of various sizes by a mobile phone at 900 and 1800 MHz. Specifically, the study considers the differences in SAR between adults and children. Moreover, these differences are assessed for compliance with international safety guidelines. The effects of these head models on the most important parameters for a mobile terminal antenna designer, namely radiation efficiency, total efficiency and directivity, are also investigated.

A. Z. E. D. M. Moussa (✉) Department of Electrical Engineering, High Institute of Energy, South Valley University, Aswan, 81258, Egypt e-mail: [email protected]
A. Amro Department of Communications Engineering, Faculty of Engineering, Al-Hussein Bin Talal University, P.O. Box 20, Ma’an, Jordan e-mail: [email protected]

S. I. Ao and L. Gelman (eds.), Electrical Engineering and Applied Computing, Lecture Notes in Electrical Engineering, 90, DOI: 10.1007/978-94-007-1192-1_10, © Springer Science+Business Media B.V. 2011

10.1 Introduction In recent years, much attention has been paid to the health implications of electromagnetic (EM) waves, especially for the human head, which is exposed to the EM fields radiated from handsets. The recent explosive increase in the use of mobile communication handsets, especially in the number of children using a mobile phone, raises many questions about the nature and degree of absorption of EM waves by this category of the public as a function of their age and their morphology. For this reason the World Health Organization (WHO) has recommended that research studies be undertaken on this subject [1–3]. This chapter investigates the effects of head models of various sizes on the most important parameters for a mobile terminal antenna designer, namely radiation efficiency, total efficiency and directivity, and also on the Specific Absorption Rates (SAR) induced in them. For this purpose, these parameters are compared between an adult human head and several child heads, each obtained as a percentage of the adult head size. The results are obtained using an electromagnetic field solver employing the Integral Equations method [4].

Fig. 10.1 Description of the various sizes of human head models (100, 95, 90, 85 and 80% of the adult head size)

The SAR is the most appropriate metric for determining EM exposure in the very near field of a Radio Frequency (RF) source [5–9]. The local SAR (W/kg) at any point in the human head is defined as:

\[
\mathrm{SAR} = \frac{\sigma E^{2}}{2\rho} \tag{10.1}
\]

where E is the peak amplitude of the electric field in the human head tissue (V/m), σ is the tissue conductivity (S/m) and ρ is the tissue density (kg/m³). The SAR averaged over masses of 10 and 1 g in the head and the other parameters of the mobile antenna are determined in each case.
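As a small worked illustration of Eq. (10.1), the sketch below evaluates the local SAR from a peak electric field, and inverts the relation to find the field that corresponds to a given SAR. The conductivity and density are the brain-liquid values listed for 900 MHz in Table 10.1; the field value of 50 V/m and the function names are hypothetical and serve only as an example.

```python
import math


def local_sar(e_peak, conductivity, density):
    """Local SAR in W/kg from Eq. (10.1): conductivity * E^2 / (2 * density).

    e_peak is the peak electric field (V/m), conductivity is in S/m and
    density in kg/m^3.
    """
    return conductivity * e_peak ** 2 / (2.0 * density)


def e_peak_for_sar(sar, conductivity, density):
    """Peak electric field (V/m) that yields the given local SAR (W/kg)."""
    return math.sqrt(2.0 * density * sar / conductivity)


# Brain-liquid properties at 900 MHz from Table 10.1.
SIGMA_BRAIN_900 = 0.77   # S/m
RHO_BRAIN = 1030.0       # kg/m^3

sar_example = local_sar(50.0, SIGMA_BRAIN_900, RHO_BRAIN)  # hypothetical 50 V/m field
field_at_limit = e_peak_for_sar(1.6, SIGMA_BRAIN_900, RHO_BRAIN)
print(f"SAR at 50 V/m: {sar_example:.3f} W/kg")
print(f"Field for 1.6 W/kg: {field_at_limit:.1f} V/m")
```

The factor of 2 in the denominator reflects the use of the peak field amplitude; with an RMS field value the same expression would be written without it.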

10.2 Modeling of Human Head For this study, five head models are used: that of an adult (100%) and four child head models with sizes of 95, 90, 85 and 80% of the adult head size, as shown in Fig. 10.1. Each head model consists of a shell of skin tissue filled with a liquid having the properties of brain tissue. For simulation of the EM fields in the human head, the appropriate parameters for the conductivity σ (S/m), the relative permittivity εr and the tissue density ρ (kg/m³) of all the materials used in the calculation must be known. Additionally, the frequency dependence of these parameters must be considered and chosen appropriately. A recent compilation by Gabriel et al. covers a wide range of different body tissues and offers equations to determine the appropriate dielectric values at each desired frequency [10, 11]. Table 10.1 shows the real part of the dielectric permittivity εr, the conductivity σ (S/m), and the mass density ρ (kg/m³) of the tissues used in the simulations at 900 and 1800 MHz. Table 10.2 shows the tissue volume and mass of all head models.

Table 10.1 Dielectric permittivity εr, conductivity σ (S/m), and mass density ρ (kg/m³) of tissues used in the simulations at 900 and 1800 MHz

Tissue           Frequency   εr      σ (S/m)   ρ (kg/m³)
Shell (skin)     900 MHz     43.8    0.86      1000
Shell (skin)     1800 MHz    38.87   1.19      1000
Liquid (brain)   900 MHz     45.8    0.77      1030
Liquid (brain)   1800 MHz    43.5    1.15      1030

Table 10.2 Volume and mass of the head models

Human head size as a percent of an adult one (%)   100      95       90       85       80
Tissue volume (10⁶ mm³)                            5.5893   4.7886   4.0706   3.4283   2.8573
Tissue mass (kg)                                   5.7439   4.9236   4.1855   3.5250   2.9379
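The values in Table 10.2 can be cross-checked with a few lines of arithmetic. The sketch below computes the mean density of each head model (mass divided by volume) and compares the volume ratios with the cube of the linear scale factor. Reading the percentage head size as a linear scale factor is an assumption made here for illustration; the text does not state it explicitly.

```python
# Head model data transcribed from Table 10.2.
SIZES_PERCENT = [100, 95, 90, 85, 80]
VOLUME_MM3 = [5.5893e6, 4.7886e6, 4.0706e6, 3.4283e6, 2.8573e6]
MASS_KG = [5.7439, 4.9236, 4.1855, 3.5250, 2.9379]


def mean_density(volume_mm3, mass_kg):
    """Mean density in kg/m^3 (1 mm^3 = 1e-9 m^3)."""
    return mass_kg / (volume_mm3 * 1e-9)


densities = [mean_density(v, m) for v, m in zip(VOLUME_MM3, MASS_KG)]

# Volume of each model relative to the adult head, and the cube of the
# linear scale factor (assumed interpretation of the percentage size).
volume_ratios = [v / VOLUME_MM3[0] for v in VOLUME_MM3]
cubic_ratios = [(size / 100.0) ** 3 for size in SIZES_PERCENT]

for size, rho, ratio, cubic in zip(SIZES_PERCENT, densities, volume_ratios, cubic_ratios):
    print(f"{size:3d}%  density {rho:7.1f} kg/m^3  volume ratio {ratio:.4f}  scale^3 {cubic:.4f}")
```

The mean densities fall between the skin (1000 kg/m³) and brain-liquid (1030 kg/m³) values of Table 10.1, and the volume ratios track the cubed scale factor closely, which appears consistent with uniformly scaled models.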

10.3 Modeling of the Mobile Phone The mobile handset consists of a quarter-wavelength monopole (of radius 0.0025 m at 900 MHz and 0.001 m at 1800 MHz) mounted on a handset body (treated as a metal box of 1.8 × 4 × 10 cm). It operates at 900 and 1800 MHz with a radiated power of 0.125 W, as shown in Fig. 10.2.
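For orientation, the nominal length of a quarter-wavelength monopole at the two operating frequencies can be estimated from the free-space wavelength. This is only a first-order sketch: the chapter does not give the monopole lengths used in the simulations, and practical antennas are usually trimmed slightly shorter than the free-space value.

```python
SPEED_OF_LIGHT = 299_792_458.0  # m/s


def quarter_wavelength_m(frequency_hz):
    """Free-space quarter wavelength in metres."""
    return SPEED_OF_LIGHT / (4.0 * frequency_hz)


lengths = {f_mhz: quarter_wavelength_m(f_mhz * 1e6) for f_mhz in (900, 1800)}
for f_mhz, length in lengths.items():
    print(f"{f_mhz} MHz: {length * 100:.2f} cm")
```

Doubling the frequency halves the nominal monopole length, so the 1800 MHz antenna is about half as long as the 900 MHz one on the same 10 cm handset body.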

10.4 Results and Discussion Figures 10.3 and 10.4 present the mobile terminal antenna designer parameters, namely return loss, radiation efficiency, total efficiency and directivity, obtained in the absence of the human head at a frequency of 1800 MHz. Table 10.3 presents the mobile antenna parameters, namely radiation efficiency, total efficiency and directivity, obtained for the various sizes of human head and for the case of their absence. It is seen that the presence of a head reduces the radiation and total efficiencies and increases the directivity; as the head size decreases from 100 to 80%, the radiation efficiency and total efficiency increase, whereas the directivity decreases. The SAR results of the different kinds are given in Table 10.4 for each frequency and for each head model studied.


Fig. 10.2 Description of the mobile handset

Fig. 10.3 Return loss without human head

Fig. 10.4 Far field without human head

Table 10.3 Mobile antenna parameters with various sizes of human head

                          Without   Human head size as a percent of an adult one (%)
Frequency   Parameter     head      100     95      90      85      80
900 MHz     Rad. η        1.003     0.276   0.292   0.311   0.335   0.357
900 MHz     Tot. η        0.788     0.271   0.286   0.305   0.328   0.35
900 MHz     Dir. (dBi)    2.627     6.066   5.943   5.859   5.819   5.712
1800 MHz    Rad. η        1.003     0.485   0.498   0.512   0.528   0.546
1800 MHz    Tot. η        1.002     0.476   0.489   0.504   0.521   0.539
1800 MHz    Dir. (dBi)    3.653     7.98    7.855   7.756   7.673   7.552

Table 10.4 SAR induced in children’s heads

                                      Human head size as a percent of an adult one (%)
Frequency   Parameter                 100     95      90      85      80
900 MHz     SAR (point)               1.134   1.206   1.124   1.122   1.214
900 MHz     SAR 1 g                   0.818   0.805   0.785   0.769   0.769
900 MHz     SAR 10 g                  0.593   0.59    0.584   0.58    0.572
900 MHz     Absorbed power (W rms)    0.089   0.087   0.085   0.082   0.079
900 MHz     Total SAR (W/kg)          0.016   0.018   0.02    0.023   0.027
1800 MHz    SAR (point)               4.149   3.078   2.404   2.319   2.282
1800 MHz    SAR 1 g                   1.590   1.530   1.482   1.399   1.312
1800 MHz    SAR 10 g                  0.922   0.887   0.848   0.805   0.764
1800 MHz    Absorbed power (W rms)    0.064   0.062   0.060   0.058   0.056
1800 MHz    Total SAR (W/kg)          0.011   0.012   0.014   0.016   0.019

Fig. 10.5 Distributions of the local SAR at x = 0 plane for 1800 MHz

The "SAR 10 grams" is the maximum SAR value averaged over 10 g, obtained by averaging the SAR around each point in the volume and adding the nearest points until an averaging mass of 10 g is reached, the resulting volume having the shape of a portion of a sphere. The "contiguous SAR 1 gram" is estimated by averaging around the local maximum SAR, adding the highest-SAR volume in a given tissue until a mass of 1 g is reached. The SAR (point) is the local value of SAR at every point inside the head model. The results show that as the head size decreases, the peak SAR 1 g and peak SAR 10 g decrease; the absorbed power also decreases, but the absorbed power per unit head mass increases. So the total SAR in children’s heads increases as the head size decreases, as indicated in Table 10.4, whereas the local SAR (point) shows no single trend at 900 MHz and decreases at 1800 MHz. Also from Table 10.4 it is noticed that the total SAR over the whole human head at 1800 MHz is less than that at 900 MHz. This is because the SAR regions produced by the monopole antenna at 900 MHz are more extended than those induced at 1800 MHz. The human body works as a barrier, mainly at high frequencies, because of skin depth. As the frequency increases, the penetration capacity decreases and the waves become more susceptible to obstacles. Figures 10.5, 10.6, 10.7, 10.8, 10.9 and 10.10 show the distributions of the local SAR at the y = 0 plane, the 10 g SAR in the xz plane and the 1 g SAR in the xy plane, in W/kg, on the human head of various sizes, obtained with a radiated power of 125 mW from a monopole antenna operating at 900 and 1800 MHz, respectively. It can easily be seen that the high SAR regions produced by the 900 MHz monopole antenna are more extended than those induced by the 1800 MHz monopole antenna, as explained before.
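The skin depth argument can be made concrete with a rough estimate. The sketch below uses the standard plane-wave attenuation constant for a homogeneous lossy dielectric, together with the brain-liquid permittivity and conductivity from Table 10.1. The plane-wave, homogeneous-medium assumption is a simplification introduced here; the near-field simulations in the chapter do not rely on it.

```python
import math

EPS0 = 8.8541878128e-12    # vacuum permittivity, F/m
MU0 = 4e-7 * math.pi       # vacuum permeability, H/m


def penetration_depth_m(frequency_hz, eps_r, conductivity):
    """Depth (m) at which a plane wave's field falls to 1/e in a lossy dielectric.

    Uses alpha = omega * sqrt(mu*eps/2) * sqrt(sqrt(1 + (sigma/(omega*eps))^2) - 1)
    for a non-magnetic medium and returns 1 / alpha.
    """
    omega = 2.0 * math.pi * frequency_hz
    eps = eps_r * EPS0
    loss_tangent = conductivity / (omega * eps)
    alpha = omega * math.sqrt(MU0 * eps / 2.0) * math.sqrt(
        math.sqrt(1.0 + loss_tangent ** 2) - 1.0
    )
    return 1.0 / alpha


# Brain-liquid properties from Table 10.1.
depth_900 = penetration_depth_m(900e6, 45.8, 0.77)
depth_1800 = penetration_depth_m(1800e6, 43.5, 1.15)
print(f"900 MHz:  {depth_900 * 1000:.0f} mm")
print(f"1800 MHz: {depth_1800 * 1000:.0f} mm")
```

Under these assumptions the estimated depth drops from roughly 47 mm at 900 MHz to roughly 31 mm at 1800 MHz, in line with the more extended SAR regions reported at 900 MHz.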

Fig. 10.6 Distributions of the (10 g) SAR at xz plane for 1800 MHz

Fig. 10.7 The distributions of the (1 g) SAR at xy plane for 1800 MHz

Fig. 10.8 Distributions of the local SAR at x = 0 plane for 900 MHz

Fig. 10.9 Distributions of the (10 g) SAR at xz plane for 900 MHz

Fig. 10.10 Distributions of the (1 g) SAR at xy plane for 900 MHz

10.5 Conclusion The obtained results show that the spatial-peak SAR values at a point, or as averaged over 1 and 10 g, on the human head of various sizes, obtained with a radiated power of 125 mW from a monopole antenna operating at 900 and 1800 MHz, vary with the size of the head at each frequency. The size of the head also has an effect on the mobile terminal antenna designer parameters, and this effect cannot be eliminated, because it is an electromagnetic characteristic. The obtained results show that the spatial-peak SAR values as averaged over 1 g on the human head, obtained with a radiated power of 0.125 W, are for all simulations below the limit of 1.6 W/kg, which is recommended by FCC and ICNIRP [12–14].
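The compliance statement can be checked directly against the tabulated values. The sketch below takes the 1 g SAR, absorbed power and total SAR rows of Table 10.4 and the head masses of Table 10.2, computes the margin to the 1.6 W/kg limit, and recomputes the whole-head SAR as absorbed power divided by head mass. That last relation is an inference from the tabulated numbers rather than a definition given in the text, and the names used are hypothetical.

```python
LIMIT_1G_W_PER_KG = 1.6  # limit quoted in the text for SAR averaged over 1 g

# Head masses (kg) from Table 10.2, for sizes 100, 95, 90, 85 and 80%.
MASS_KG = [5.7439, 4.9236, 4.1855, 3.5250, 2.9379]

# Rows transcribed from Table 10.4, keyed by frequency in MHz.
TABLE_10_4 = {
    900: {
        "sar_1g": [0.818, 0.805, 0.785, 0.769, 0.769],
        "absorbed_w": [0.089, 0.087, 0.085, 0.082, 0.079],
        "total_sar": [0.016, 0.018, 0.020, 0.023, 0.027],
    },
    1800: {
        "sar_1g": [1.590, 1.530, 1.482, 1.399, 1.312],
        "absorbed_w": [0.064, 0.062, 0.060, 0.058, 0.056],
        "total_sar": [0.011, 0.012, 0.014, 0.016, 0.019],
    },
}


def whole_head_sar(absorbed_w, mass_kg):
    """Whole-head SAR (W/kg), assumed to be absorbed power over head mass."""
    return absorbed_w / mass_kg


def compliance_margin(sar_1g_values, limit=LIMIT_1G_W_PER_KG):
    """Margin (W/kg) between the limit and the largest 1 g SAR value."""
    return limit - max(sar_1g_values)


recomputed = {
    freq: [whole_head_sar(p, m) for p, m in zip(rows["absorbed_w"], MASS_KG)]
    for freq, rows in TABLE_10_4.items()
}
margins = {freq: compliance_margin(rows["sar_1g"]) for freq, rows in TABLE_10_4.items()}

for freq in TABLE_10_4:
    print(freq, "MHz margin:", round(margins[freq], 3), "W/kg")
```

All 1 g values stay under the limit, although the adult head at 1800 MHz (1.590 W/kg) leaves only a small margin. The recomputed whole-head values match the tabulated total SAR to within rounding.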

References
1. International Association of Engineers [Online]. Available: http://www.iaeng.org
2. El Dein AZ, Amr A (2010) Specific absorption rate (SAR) induced in human heads of various sizes when using a mobile phone at 900 and 1800 MHz. Lecture notes in engineering and computer science: Proceeding of the World Congress on Engineering 2010, Vol I, WCE 2010, 30 June–2 July, London, UK, pp 759–763
3. Kitchen R (2001) RF and microwave radiation safety handbook, Chapter 3, 2nd edn. Newnes, Oxford, pp 47–85
4. CST Microwave studio site. Available: http://www.cst.com/
5. Kiminami K, Iyama T, Onishi T, Uebayashi S (2008) Novel specific absorption rate (SAR) estimation method based on 2-D scanned electric fields. IEEE Trans Electromagn Compat 50(4):828–836
6. Watanabe S, Taki M, Nojima T, Fujiwara O (1996) Characteristics of the SAR distributions in a head exposed to electromagnetic fields radiated by a hand-held portable radio. IEEE Trans Microwave Theory Tech 44(10):1874–1883
7. Hadjem A, Lautru D, Dale C, Wong MF, Fouad-Hanna V, Wiart J (2004) Comparison of specific absorption rate (SAR) induced in child-sized and adult heads using a dual band mobile phone. Proceeding on IEEE MTT-S Int. Microwave Symposium Digest, June 2004
8. Kivekäs O, Ollikainen J, Lehtiniemi T, Vainikainen P (2004) Bandwidth, SAR, and efficiency of internal mobile phone antennas. IEEE Trans Electromagn Compat 46(1):71–86
9. Beard BB et al (2006) Comparisons of computed mobile phone induced SAR in the SAM phantom to that in anatomically correct models of the human head. IEEE Trans Electromagn Compat 48(2):397–407
10. Gabriel C (1996) Compilation of the Dielectric Properties of Body Tissues at RF and Microwave Frequencies. Brooks Air Force Technical Report AL/OE-TR-1996-0037 [Online]. Available: http://www.fcc.gov/cgi-bin/dielec.sh
11. El Dein AZ (2010) Interaction between the human body and the mobile phone. Book Published by LAP Lambert Academic, ISBN 978-3-8433-5186-7
12. FCC, OET Bulletin 65, Evaluating Compliance with FCC Guidelines for Human Exposure to Radiofrequency Electromagnetic Fields. Edition 97-01, released December, 1997
13. IEEE C95.1-1991 (1992) IEEE standard for safety levels with respect to human exposure to radio frequency electromagnetic fields, 3 kHz to 300 GHz. Institute of Electrical and Electronics Engineers, Inc., New York
14. European Committee for Electrotechnical Standardization (CENELEC) (1995) Prestandard ENV 501 66-2, Human exposure to electromagnetic fields. High frequency (10 kHz to 300 GHz)

Chapter 11

A Medium Range Gbps FSO Link Extended Field Performance Measurements J. A. R. Pacheco de Carvalho, N. Marques, H. Veiga, C. F. Ribeiro Pacheco and A. D. Reis

Abstract Wireless communications have become increasingly important. Besides Wi-Fi, FSO plays a very relevant technological role in this context. Performance is essential, resulting in more reliable and efficient communications. A 1.14 km FSO medium range link has been successfully implemented for high requirement applications at Gbps rates. An extended experimental performance evaluation of this link has been carried out at OSI levels 1, 4 and 7, through a specifically planned field test arrangement. Several results, obtained from simultaneous measurements of the powers received by the laser heads during TCP, UDP and FTP experiments, are presented and discussed.

J. A. R. P. de Carvalho (✉) · C. F. Ribeiro Pacheco · A. D. Reis Unidade de Detecção Remota, Universidade da Beira Interior, 6201-001, Covilhã, Portugal e-mail: [email protected]
C. F. Ribeiro Pacheco e-mail: [email protected]
A. D. Reis e-mail: [email protected]
N. Marques · H. Veiga Centro de Informática, Universidade da Beira Interior, 6201-001, Covilhã, Portugal e-mail: [email protected]
H. Veiga e-mail: [email protected]

S. I. Ao and L. Gelman (eds.), Electrical Engineering and Applied Computing, Lecture Notes in Electrical Engineering, 90, DOI: 10.1007/978-94-007-1192-1_11, © Springer Science+Business Media B.V. 2011


11.1 Introduction Wi-Fi and FSO are wireless communications technologies whose importance and utilization have been growing for their versatility, mobility, speed and favourable prices. Wi-Fi uses microwaves in the 2.4 and 5 GHz frequency bands and the IEEE 802.11a, b, g standards. Nominal transfer rates up to 11 (802.11b) and 54 Mbps (802.11a, g) are specified [1]. It has been used in ad hoc and infrastructure modes. Point-to-point and point-to-multipoint configurations are used both indoors and outdoors, requiring specific directional and omnidirectional antennas. FSO uses laser technology to provide point-to-point communications e.g. to interconnect the LANs of two buildings having line-of-sight. FSO was developed in the 1960s for military and other purposes, including high requirement applications. At present, speeds typically up to 2.5 Gbps and ranges up to a few km are possible, depending on technology and atmospheric conditions. Interfaces such as Fast Ethernet and Gigabit Ethernet are used to communicate with LANs. Typical laser wavelengths of 785, 850 and 1550 nm are used. In an FSO link the transmitters deliver high power light which, after travelling through the atmosphere, appears as low power light at the receiver. The link margin of the connection represents the amount of light received by a terminal over the minimum value required to keep the link active: link margin (dB) = 10 log10(P/Pmin), where P and Pmin are the received power and the minimum required power, respectively. There are several factors related to performance degradation in the design of an FSO link: distance between optical emitters; line of sight; alignment of optical emitters; stability of the mounting points; atmospheric conditions; water vapour or hot air; strong electromagnetic interference; wavelength of the laser light [2]. A redundant microwave link is always essential, as the laser link can fail under adverse conditions, interrupting communications.
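The link margin expression above is simple to evaluate. The following sketch implements it directly; the received and minimum power values in the example are hypothetical and are not taken from the measurements reported in this chapter.

```python
import math


def link_margin_db(received_power, minimum_power):
    """Link margin in dB: 10 * log10(P / Pmin).

    Both powers must be positive and expressed in the same unit.
    """
    if received_power <= 0 or minimum_power <= 0:
        raise ValueError("powers must be positive")
    return 10.0 * math.log10(received_power / minimum_power)


# Hypothetical example: 10 microwatts received, 0.1 microwatt required.
margin = link_margin_db(10e-6, 0.1e-6)
print(f"link margin: {margin:.1f} dB")
```

A margin of 0 dB means the received power is exactly at the minimum needed to keep the link active, and a negative value means the link would drop.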
Several studies and implementations of FSO have been reported [3, 4]. FSO has been used in hybrid systems for temporary multimedia applications [5]. Performance has been a very important issue, resulting in more reliable and efficient communications. Telematic applications have specific performance requirements, depending on the application. New telematic applications are especially sensitive to performance, when compared to traditional applications. For example, requirements have been quoted as: for video on demand/moving images, 1–10 ms jitter and 1–10 Mbps throughput; for Hi Fi stereo audio, jitter less than 1 ms and 0.1–1 Mbps throughput [6]. Several performance measurements have been made for Wi-Fi [7, 8]. FSO and fiber optics have been applied at the University of Beira Interior Campus, at Covilhã City, Portugal, to improve communications quality [9–12]. In the present work we have further investigated that FSO link, for extended performance evaluation at OSI levels 1, 4 and 7. The rest of the paper is structured as follows: Sect. 11.2 presents the experimental details, i.e. the measurement setup and procedure. Results and discussion are presented in Sect. 11.3. Conclusions are drawn in Sect. 11.4.

11

A Medium Range Gbps FSO Link

Fig. 11.1 View of the 1.14 km laser link between Pole II (SB) and Pole III (FM)

11.2 Experimental Details

The main experimental details, for testing the quality of the FSO link, are as follows. A 1 Gbps full-duplex link was planned and implemented to interconnect the LAN at the Faculty of Medicine building and the main University network, to support medical imaging, VoIP, audio and video traffic [9, 10]. Then, a FSO laser link at 1 Gbps full-duplex, over a distance of 1.14 km, was created to interconnect the Faculty of Medicine (FM) building at Pole III and the Sports (SB) building at Pole II of the University (Fig. 11.1). We have chosen laser heads from FSONA (Fig. 11.2) to implement the laser link at a laser wavelength of λ = 1550 nm for eye safety, as the allowable laser power is about fifty times higher at 1550 nm than at 800 nm [2, 13]. Each laser head comprised two independent transmitters, for redundancy, and one wide aperture receiver. Each laser had 140 mW of power, resulting in an output power of 280 mW (24.5 dBm). 1000Base-LX links over OM3 50/125 μm fiber were used to connect the laser heads to the LANs. For redundancy, an 802.16d WiMAX point-to-point link at 5.4 GHz was available, where data rates up to either 75 Mbps or 108 Mbps were possible in normal mode or in turbo mode, respectively [14]. This link was used as a backup link for FM-SB communications, through configuration of two static routing entries in the switching/routing equipment [9].
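The quoted output power can be verified with the standard conversion between milliwatts and dBm. This is an illustrative sketch only; the helper names are hypothetical and the only inputs are the two 140 mW transmitters mentioned above.

```python
import math


def mw_to_dbm(power_mw: float) -> float:
    """Convert a power in milliwatts to dBm (decibels relative to 1 mW)."""
    if power_mw <= 0:
        raise ValueError("power must be positive")
    return 10.0 * math.log10(power_mw)


def dbm_to_mw(power_dbm: float) -> float:
    """Convert a power in dBm back to milliwatts."""
    return 10.0 ** (power_dbm / 10.0)


# Two independent 140 mW transmitters per laser head.
total_mw = 2 * 140.0
print(f"{total_mw:.0f} mW = {mw_to_dbm(total_mw):.1f} dBm")
```

The sketch yields roughly 24.5 dBm for 280 mW, matching the figure given in the text.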

J. A. R. P. de Carvalho et al.

Fig. 11.2 View of the laser heads at FM (Pole III) and SB (Pole II)

Performance tests of the FSO link were made under favourable weather conditions. During the tests we used a data rate mode for the laser heads which was compatible with Gigabit Ethernet. At OSI level 1 (physical layer), received powers were simultaneously measured for both laser heads. Data were collected from the internal logs of the laser heads, using STC (SONAbeam Terminal Controller) management software [13]. At OSI level 4 (transport layer), measurements were made for TCP connections and UDP communications using Iperf software [15], permitting network performance results to be recorded. Both TCP and UDP are transport protocols. TCP is connection-oriented. UDP is connectionless, as it sends data without ever establishing a connection. For a TCP connection over a link, TCP throughput was obtained. For a UDP communication, we obtained UDP throughput, jitter and percentage loss of datagrams. TCP packets and UDP datagrams of 1470 bytes size were used. A window size of 8 kbytes and a buffer size of the same value were used for TCP and UDP, respectively. A specific field test arrangement was planned and implemented for the measurements (Fig. 11.3). Two PCs having IP addresses 192.168.0.2 and 192.168.0.1 were set up as the Iperf server and client, respectively. The PCs were HP computers, with 3.0 GHz Pentium IV CPUs, running Windows XP. The server had a better RAM configuration than the client. They were both equipped with 1000Base-T network adapters. Each PC was connected via 1000Base-T to a C2 Enterasys switch [16]. Each switch had a 1000Base-LX interface. Each interface was intended to establish a FSO link through two laser heads, as represented in Fig. 11.3. The laser heads were located at Pole II and Pole III, at the SB and FM buildings, respectively. The experimental arrangement could be remotely accessed through the FM LAN. In the UDP tests a bandwidth parameter of 300 Mbps was used in the Iperf client.
Jitter, which represents the smoothed mean of differences between consecutive transit times, was continuously computed by the server, as specified by RTP in RFC 1889 [17]. RTP provides end-to-end network transport functions appropriate for applications

Fig. 11.3 Field tests setup scheme for the FSO link

transmitting real-time data, e.g. audio, video, over multicast or unicast network services. At OSI level 7 (application layer) the setup given in Fig. 11.3 was also used for measurements of FTP transfer rates through FTP server and client applications installed in the PCs. Each measurement corresponded to a single FTP transfer, using a 2.71 Gbyte file. Whenever a measurement was made at either OSI level 4 or 7, data were simultaneously collected at OSI level 1. Batch command files were written to enable the TCP, UDP and FTP tests. The results, obtained in batch mode, were recorded as data files in the client PC disk.
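The jitter estimate mentioned above can be illustrated with the interarrival jitter formula of RFC 1889, J = J + (|D(i-1, i)| - J)/16, where D is the difference between the transit times of two consecutive packets. The sketch below is not the Iperf implementation; the function name and the sample timestamps are hypothetical.

```python
def rfc1889_jitter(send_times, recv_times):
    """Return the running jitter estimate after each packet.

    Transit time of a packet = receive time - send time. For each pair of
    consecutive packets, D is the change in transit time, and the estimate
    is smoothed with gain 1/16: J = J + (|D| - J) / 16.
    The result is in the same unit as the input timestamps.
    """
    if len(send_times) != len(recv_times):
        raise ValueError("timestamp lists must have the same length")
    jitter = 0.0
    history = []
    previous_transit = None
    for sent, received in zip(send_times, recv_times):
        transit = received - sent
        if previous_transit is not None:
            deviation = abs(transit - previous_transit)
            jitter += (deviation - jitter) / 16.0
        history.append(jitter)
        previous_transit = transit
    return history


# Hypothetical timestamps in ms: the third datagram arrives 1 ms late.
sent = [0.0, 10.0, 20.0, 30.0]
received = [5.0, 15.0, 26.0, 35.0]
print(rfc1889_jitter(sent, received))
```

Because of the 1/16 gain, a single late datagram moves the estimate only slightly, which is why the reported jitter varies smoothly over time.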

11.3 Results and Discussion

Several sets of data were collected and processed. The TCP, UDP and FTP experiments were not simultaneous. The corresponding results are shown for TCP in Fig. 11.4, for UDP in Fig. 11.6 and for FTP in Fig. 11.8. The average received powers for the SB and FM laser heads were mostly high, in the 25–35 μW interval, which corresponds to link margins of 4.9–6.4 dB (considering Pmin = 8 μW). From Fig. 11.4 it follows that TCP average throughput (314 Mbps) is very steady; some small peaks arise for throughput deviation. Figure 11.5 illustrates details of TCP results over a small interval. Figure 11.6 shows that UDP average throughput (125 Mbps) is fairly steady, having a small steady throughput deviation. The jitter is small, usually less than 1 ms, while percentage datagram loss is practically negligible. Figure 11.7 illustrates details of UDP-jitter results over a small interval. Figure 11.8 shows that average FTP throughput (344 Mbps) is very steady, having low throughput deviation.

Fig. 11.4 TCP results

Fig. 11.5 Details of TCP results

Figure 11.9 illustrates details of FTP results over a small interval. Transfer rates of the PCs' disks are always a limitation in this type of FTP experiment. In all cases, high values of average received powers were observed. The quantities under analysis did not show, on average, significant variations even when the received powers varied. The results obtained here complement previous work by the authors [9–12]. Generally, for our experimental conditions, the FSO link has exhibited very good performance at OSI levels 4 and 7. Besides the present results, it must be mentioned that we have implemented a VoIP solution based on Cisco Call Manager [18]. VoIP, with G.711 and G.729A coding algorithms, has been working over the laser link without any performance problems. Tools such as Cisco IP Communicator have been used. Video and sound have also been tested through the laser link, by using eyeBeam Softphone CounterPath software [19]. Applications using the link have been well-behaved.

Fig. 11.6 UDP results; 300 Mbps bandwidth parameter

Fig. 11.7 Details of UDP-jitter results; 300 Mbps bandwidth parameter

11.4 Conclusions

A FSO laser link at 1 Gbps has been successfully implemented over 1.14 km along the city, for interconnecting Poles of the University and supporting high requirement applications. A field test arrangement has been planned and implemented, permitting extended performance measurements of the FSO link at OSI levels 1, 4 and 7. At OSI level 1, received powers were simultaneously measured in both laser heads.

Fig. 11.8 FTP results

Fig. 11.9 Details of FTP results

At OSI level 4, TCP throughput, jitter and percentage datagram loss were measured. Through OSI level 7, FTP transfer rate data were acquired. Under favourable weather conditions, when the measurements were carried out, the link behaved very well, giving very good performance. Applications such as VoIP, video and sound have been well-behaved. Further measurements are planned under several experimental conditions.

Acknowledgments

Support from University of Beira Interior and FCT (Fundação para a Ciência e a Tecnologia)/POCI2010 (Programa Operacional Ciência e Inovação) is acknowledged. We acknowledge Hewlett Packard and FSONA for their availability.

References

1. IEEE Std 802.11-2007 (2007) IEEE Standard for Local and metropolitan area networks-Specific Requirements-Part 11: Wireless LAN Medium Access Control (MAC) and Physical Layer (PHY) specifications (October 10, 2007); http://standards.ieee.org/getieee802
2. Rockwell DA, Mecherle GS (2001) Wavelength selection for optical wireless communication systems. Proc SPIE 4530:27–35
3. Amico MD, Leva A, Micheli B (2003) Free space optics communication systems: first results from a pilot field trial in the surrounding area of Milan Italy. IEEE Microwave Wirel Compon Lett 13(8):305–307, August
4. Löschnigg M, Mandl P, Leitgeb E (2009) Long-term performance observation of a free space optics link. In: Proceedings of the 10th International Conference on Telecommunications-Contel, Zagreb, Croatia, June 8–10, pp 305–310
5. Mandl P, Chlestil Ch, Zettl K, Leitgeb E (2007) Hybrid systems using optical wireless, fiber optics and WLAN for temporary multimedia applications. In: Proceedings of the 9th International Conference on Telecommunications-Contel, Zagreb, Croatia, June 13–15, pp 73–76
6. Monteiro E, Boavida F (2002) Engineering of informatics networks, 4th edn. FCA-Editor of Informatics Ld, Lisbon
7. Pacheco de Carvalho JAR, Gomes PAJ, Veiga H, Reis AD (2008) Development of a university networking project. In: Putnik GD, Manuela Cunha M (eds) Encyclopedia of Networked and Virtual Organizations. IGI Global, Hershey, pp 409–422
8. Pacheco de Carvalho JAR, Veiga H, Marques N, Ribeiro Pacheco CF, Reis AD (2010) Laboratory performance of Wi-Fi WEP point-to-point links: a case study. Lecture notes in engineering and computer science: Proceedings of The World Congress on Engineering, WCE 2010, vol I, London, UK, 30 June–2 July, pp 764–767
9. Pacheco de Carvalho JAR, Gomes PAJ, Veiga H, Reis AD (2007) Wi-Fi and very high speed optical links for data and voice communications. In: Proc. 2a Conferência Ibérica de Sistemas e Tecnologias de Informação, Universidade Fernando Pessoa, Porto, Portugal, 21–23 June, pp 441–452
10. Pacheco de Carvalho JAR, Veiga H, Gomes PAJ, Reis AD (2008) Experimental performance evaluation of a very high speed free space optics link at the university of Beira interior campus: a case study. In: Proc. SEONs 2008-VI Symposium on Enabling Optical Network and Sensors, Porto, Portugal, 20–20 June, pp 131–132
11. Pacheco de Carvalho JAR, Veiga H, Gomes PAJ, Ribeiro Pacheco CFFP, Reis AD (2008) Experimental performance study of a very high speed free space optics link at the university of Beira interior campus: a case study. In: Proc. ISSPIT 2008-8th IEEE International Symposium on Signal Processing and Information Technology, Sarajevo, Bosnia and Herzegovina, December 16–19, pp 154–157
12. Pacheco de Carvalho JAR, Marques N, Veiga H, Ribeiro Pacheco CF, Reis AD (2010) Field performance measurements of a Gbps FSO link at Covilha City, Portugal. Lecture notes in engineering and computer science: Proceedings of the world congress on engineering, WCE 2010, vol I, 30 June–2 July, London, UK, pp 814–818
13. Web site http://www.fsona.com; SONAbeam 1250-S technical data; SONAbeam Terminal Controller management software
14. Web site http://www.alvarion.com; Breeze NET B100 data sheet
15. Web site http://dast.nlanr.net; Iperf software
16. Web site http://www.enterasys.com; C2 switch technical manual
17. Network Working Group. RFC 1889-RTP: A Transport Protocol for Real Time Applications, http://www.rfc-archive.org
18. Web site http://www.cisco.com; Cisco Call Manager; Cisco IP Communicator
19. Web site http://www.counterpath.com; eyeBeam Softphone CounterPath software

Chapter 12

A Multi-Classifier Approach for WiFi-Based Positioning System Jikang Shin, Suk Hoon Jung, Giwan Yoon and Dongsoo Han

Abstract WLAN fingerprint-based positioning systems are a viable solution for estimating the location of mobile stations. Recently, various machine learning techniques have been applied to WLAN fingerprint-based positioning systems to further enhance their accuracy. Due to the noisy characteristics of RF signals, as well as the lack of studies on the environmental factors affecting signal propagation, however, the accuracy of the previously suggested systems seems to depend strongly on numerous environmental conditions. In this work, we have developed a multi-classifier for WLAN fingerprint-based positioning systems employing a combining rule. According to experiments with the multi-classifier performed in various environments, the combination of multiple classifiers could significantly mitigate the environment-dependent characteristics of the classifiers. The performance of the multi-classifier was found to be superior to that of the single classifiers in all test environments; the average error distances and their standard deviations were improved by the multi-classifier in all test environments.

J. Shin (✉) · S. H. Jung
Department of Information and Communications Engineering, Korea Advanced Institute of Science and Technology, 373-1 Kusong-Dong, Yuseong-gu, Daejeon, 305-701, Korea

G. Yoon
Department of Electrical Engineering, Korea Advanced Institute of Science and Technology, 373-1 Kusong-Dong, Yuseong-gu, Daejeon, 305-701, Korea

D. Han
Department of Computer Science, Korea Advanced Institute of Science and Technology, 373-1 Kusong-Dong, Yuseong-gu, Daejeon, 305-701, Korea

S. I. Ao and L. Gelman (eds.), Electrical Engineering and Applied Computing, Lecture Notes in Electrical Engineering, 90, DOI: 10.1007/978-94-007-1192-1_12, Ó Springer Science+Business Media B.V. 2011

J. Shin et al.

12.1 Introduction

With the explosive proliferation of smart phones, WLAN (Wireless Local Area Network)-based positioning systems have increasingly become mainstream in Location-based Service (LBS) regimes. Compared with other technologies such as GPS [1], RFID [2], GSM [3], ultrasonic [4] and infrared-based systems [5], WLAN-based positioning systems have some advantages in terms of coverage and cost. Most research on WLAN-based positioning systems has used the so-called Received Signal Strength Indication (RSSI) from the wireless network access points, mainly because the RSSI (also called the fingerprint) is relatively easy to obtain using software and is also one of the most relevant factors for positioning. Some studies have considered other factors such as Signal to Noise Ratio (SNR), Angle of Arrival (AOA), and Time of Arrival (TOA) for positioning systems. Milos et al. [6] examined the SNR as an additional input factor and reported that considering both SNR and RSSI could increase the performance of the WLAN-based positioning system. Yamasaki et al. [7] reported that the AOA and TOA are also important factors in positioning. However, factors such as the AOA, TOA, and SNR cannot be acquired with every wireless network interface card. Thus, the RSSI appears to have been adopted as the primary factor for WLAN-based positioning systems. In fact, utilizing the strengths of Radio Frequency (RF) signals for positioning is not simple. Due to intrinsic characteristics of RF signals such as multipath fading and interference between signals, the signal strength may change severely depending on the materials used, the positions of doors and windows, the widths of the passages, the number of APs deployed, etc. Even if the fundamental parameters are known beforehand, the derivation of the path loss function of a WLAN signal is extremely complex.
For this reason, WLAN fingerprint-based positioning systems have mostly taken statistical approaches [6]. The statistical approaches previously suggested have applied various machine learning techniques to derive the positions from the measured fingerprints [2, 8–15]. Those techniques usually comprise two phases: an off-line phase and an on-line phase. In the off-line phase, fingerprints are captured at various positions of the target place and stored in a database called a radio-map. In the on-line phase, the location of a fingerprint is estimated by comparing it with the fingerprints stored in the database. The main problem of WLAN fingerprint-based positioning systems is that the system performance is highly environment-dependent; in other words, no general solution is yet available for WLAN fingerprint-based positioning systems. Each system is designed to tackle different environments, and there is no analysis on the relation between the algorithm used and the test

environments. One method may outperform other methods in one environment, but it may show inferior results in other environments. For instance, Youssef et al. [12] suggested a joint-clustering technique and confirmed in their evaluation that their proposed algorithm outperformed RADAR [2]. According to the experiment by Wilson et al. [11], however, RADAR was found to have a superior performance compared to the joint-clustering technique. This kind of problem was also observed in our experiments. In this paper, we introduce a multi-classifier for WLAN fingerprint-based positioning systems. We have combined multiple classifiers into an efficient environment-independent classifier that can achieve more stable and higher estimation accuracy in a variety of environments. The motivation for using multiple classifiers lies in the fact that classifier performance is severely environment-dependent; thus, if we can select the most accurate classifier for a given situation, we may be able to achieve better performance in diverse environments. In this work, multiple classifiers were combined using the Bayesian combination rule [16] and majority vote [17]. To demonstrate the effect of combining the classifiers, we have evaluated the proposed system in three different environments. The evaluation results revealed that the multi-classifier could outperform the single classifiers in terms of the average error distances and their standard deviations. This indicates that the proposed combining method is effective in mitigating the environment-sensitive characteristics of WLAN-based positioning systems. The remainder of this paper is organized as follows. An overview of WLAN fingerprint-based positioning is given in Sect. 12.2. We introduce a multi-classifier for WLAN fingerprint-based positioning systems in Sect. 12.3. Section 12.4 describes the experimental setup and results. Section 12.5 summarizes this work and suggests future work.

12.2 Related Work

Location estimation using the so-called WLAN fingerprint is often treated as a machine learning problem because of the high complexity of estimating the signal propagation. For this reason, various machine learning techniques have been applied. The RADAR system developed by Bahl et al. [2] is considered one of the most representative WLAN fingerprint-based systems. In this system, the authors used Pentium-based PCs as access points and laptop computers as mobile devices. The system uses nearest neighbor heuristics and triangulation methods to infer a user location. It maintains a radio map which charts the strength of the signals received from the different access points at some selected locations. Each signal-strength measurement is then compared against the radio map, and

then the best matching positions are averaged, enabling the location estimation. Roos et al. [10] proposed a probability-based system which uses the received signal strength samples to create probability distributions of the signal strength for some known locations. Once an input instance is given, it is matched against these probability distributions to find the location of the mobile device with the highest probability. The histogram method suggested by Castro et al. [18] is another example of a probability-based system. Instead of using a Gaussian distribution, it derives the distribution of the signal strength from the learning data. In addition, adaptive neural networks [13], decision trees [14, 15], and support vector machines [19] are popular in WLAN-based positioning systems; Kushki et al. [8] suggested a kernelized distance calculation algorithm for inferring the location of the measured RSSI. Recently, some researchers have focused on compensating for the characteristics of the RF signals. Berna et al. [20] suggested a system whose database considers unstable factors such as open or closed doors and changing humidity. They utilized some sensors to capture the current status of the environment. Yin [15] introduced a learning approach based on a database that is temporally updated in accordance with the current environment situation. Moraes [21] investigated a dynamic RSS mapping architecture. Wilson Yeung et al. [11] suggested using the RSSI transmitted from the mobile devices as an additional input. Thus, there are two types of databases: the RSSI transmitted by APs and the RSSI transmitted by mobile devices. In the on-line phase, the system infers multiple results from the databases and makes the final decision using a combining method.
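The nearest-neighbor matching against a radio map described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the RADAR implementation: the function names, the placeholder value used for access points missing from a scan (-100 dBm), and the sample radio map are all hypothetical.

```python
import math

# Assumed placeholder RSSI (dBm) for an AP that is absent from a fingerprint.
MISSING_RSSI = -100.0


def signal_distance(fp_a, fp_b):
    """Euclidean distance in signal space between two fingerprints.

    Each fingerprint is a dict mapping an AP identifier to its RSSI in dBm.
    """
    aps = set(fp_a) | set(fp_b)
    return math.sqrt(sum(
        (fp_a.get(ap, MISSING_RSSI) - fp_b.get(ap, MISSING_RSSI)) ** 2
        for ap in aps
    ))


def knn_locate(radio_map, measured, k=3):
    """Average the positions of the k radio-map entries closest to `measured`.

    radio_map is a list of ((x, y), fingerprint) pairs from the off-line phase.
    """
    ranked = sorted(radio_map, key=lambda entry: signal_distance(entry[1], measured))
    nearest = ranked[:k]
    x = sum(position[0] for position, _ in nearest) / len(nearest)
    y = sum(position[1] for position, _ in nearest) / len(nearest)
    return (x, y)


# Hypothetical radio map: positions in metres, RSSI values in dBm.
radio_map = [
    ((0.0, 0.0), {"ap1": -40, "ap2": -70}),
    ((1.0, 0.0), {"ap1": -45, "ap2": -65}),
    ((2.0, 0.0), {"ap1": -50, "ap2": -60}),
    ((10.0, 0.0), {"ap1": -80, "ap2": -35}),
]
print(knn_locate(radio_map, {"ap1": -45, "ap2": -66}, k=3))
```

The choice of k and of the placeholder value both affect the estimate; they are tuning assumptions of this sketch rather than values prescribed by the cited systems.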
Some research efforts [12, 22] have tackled the issue of how to reduce the computational overhead, mainly because the client devices are usually small, self-maintained and stand-alone, with a significant limitation in their power supply. Youssef et al. [12] developed a joint-clustering technique for grouping locations in order to reduce the computational cost of the system. In this method, a cluster is defined as a set of locations sharing the same set of access points. The location determination process is as follows: for a given RSSI data set, the strongest access points are selectively used to determine one cluster in which to search for the most probable location. Chen et al. [22] suggested a method which selects the most discriminative APs in order to minimize the number of APs used in the positioning system. This approach selects an appropriate subset of the existing features to address the computational complexity problem. Reducing the number of APs amounts to dimension reduction in the signal space, which in turn reduces the computational overhead required on the mobile devices. The weak spot of WLAN fingerprint-based positioning systems is that their performance is severely environment-dependent. One system may outperform the other methods in one environment, yet show an inferior performance in other environments. To solve this problem, we suggest a multi-classifier approach for WLAN fingerprint-based positioning systems, leading to more accurate results.

12.3 Proposed Method

We utilize multiple classifiers based on different algorithms to build a possibly environment-independent classifier [23]. Combining multiple classifiers to create a strong classifier is a well-established research topic, particularly in the pattern recognition area, where it is known as the Multiple Classifier System (MCS) [24]. Here, the term "combining" indicates a process of selecting the most trustworthy prediction results obtained from the classifiers. At least two reasons justify the necessity of combining multiple classifiers [25]. First, there are a number of classification algorithms available that were developed from different theories and methodologies for current pattern recognition applications. For a specific application problem, each one of these classifiers could usually reach a certain degree of success, but maybe none of them is totally perfect, or at least one of them is not as good as expected in practical applications. Second, for a specific recognition problem, there are often many types of features which could be used to represent and recognize some specific patterns. These features are also represented in diversified forms and it is relatively hard to lump them together for one single classifier to make a decision. As a result, multiple classifiers are needed to deal with the different features. This also raises the general problem of how to combine those classifiers with different features to yield improved performance. Location estimation using the WLAN fingerprint is often treated as a classification problem because of the noisy characteristics of the RF signals. Many algorithms have been proposed based on different machine learning techniques, but none of them could achieve the best performance in very diverse environments.
At this point, we realized that utilizing multiple classifiers could be a promising general solution for WLAN fingerprint-based positioning systems. In this work, we combined the Bayesian combination rule [16] and majority vote [17] for our multi-classifier. The Bayesian combination rule gives weights to the decisions of the classifiers based on information prepared in the learning phase. Usually, this information is given in the form of a matrix called a confusion matrix. The confusion matrix is constructed by cross-validation with the learning data in the off-line phase. The majority vote is a simple algorithm which chooses the location selected by more than half of the classifiers. Figure 12.1 illustrates the idea of our proposed system. In the off-line phase, the fingerprints are collected over the target environment as learning data. The fingerprint is a collection of pair-wise data containing the MAC address of an access point and its signal strength. Usually, one fingerprint contains multiple tuples of this pair-wise data, such as $\{\langle ap_1, rssi_1\rangle, \langle ap_2, rssi_2\rangle, \langle ap_3, rssi_3\rangle, \ldots\}$. After attaching the collected location labels to the fingerprints, the database stores the labeled fingerprint data. After collecting the learning data, each classifier C constructs its own confusion matrix M (Fig. 12.2) using cross-validation with the learning data. The

Fig. 12.1 The overview of multi-classifier

Fig. 12.2 An example of confusion-matrix

confusion matrix would be used as an indicator of its classifier. If there are $L$ possible locations in the positioning system, $M$ will be an $L \times L$ matrix in which the entry $M_{i,j}$ denotes the number of instances collected in location $i$ that are assigned to location $j$ by the classifier. From the matrix $M$, the total number of data collected in location $i$ can be obtained as a row sum $\sum_{i=1}^{L} M_{i,j}$, and the total number of data assigned to location $j$ can be obtained as a column sum $\sum_{j=1}^{L} M_{i,j}$. When there are $K$ classifiers, there would be $K$ confusion matrices $M^{(k)}$, $1 \le k \le K$. In the on-line phase, for the measured fingerprint $x$, the positioning results gained by the $K$ classifiers are $C_k(x) = j_k$, $1 \le k \le K$, and $j_k$ can be any of the $L$ possible locations. The probability that the decision made by the classifier $C_k$ is correct can be measured as follows:

\[ u(j_k) = P(x \in j_k \mid C_1(x) = j_1, \ldots, C_K(x) = j_K) \tag{12.1} \]

Equation 12.1 is called the belief function, and the value of this function is called the belief value. Assuming that all classifiers are independent of each other, and applying Bayes' theorem to Eq. 12.1, the belief function $u(j_k)$ can be reformulated as:

\[ u(j_k) = \prod_{i=1}^{K} \frac{P(x \in j_k \cap C_i(x) = j_i)}{P(C_i(x) = j_i)} \tag{12.2} \]

The denominator and numerator in Eq. 12.2 can be calculated using the confusion matrix $M$. The denominator indicates the probability that the classifier $C_i$ will assign the unknown fingerprint $x$ to $j_i$. This can be presented as follows:

\[ P(C_i(x) = j_i) = \frac{\sum_{j=1}^{L} M_{i,j}}{\sum_{i,j=1}^{L} M_{i,j}} \tag{12.3} \]

The numerator in Eq. 12.2 means the probability that the classifier $C_i$ will assign the unknown fingerprint $x$ collected in $j_k$ to $j_i$. This term is simply described as below:

\[ P(x \in j_k \cap C_i(x) = j_i) = \frac{M_{j_k,j_i}}{\sum_{i,j=1}^{L} M_{i,j}} \tag{12.4} \]

After applying Eqs. 12.3 and 12.4 to Eq. 12.2, Eq. 12.2 can be reformulated as:

\[ u(j_k) = \prod_{i=1}^{K} \frac{M_{j_k,j_i}}{\sum_{j=1}^{L} M_{i,j}} \]

If more than half of the classifiers' estimates point to a specific location, that location is selected as the final result. Otherwise, the belief value of each prediction is calculated, and the location with the highest belief value is the final result. In case there are many locations with the same highest belief value, the

multi-classifier system determines the middle point of those locations as the final result. For example, assume that there are three classifiers, a, b, and c, and there are three possible locations, location 1, location 2 and location 3. After the off-line phase, the confusion matrices will be as follows:

\[ M^{(a)} = \begin{pmatrix} 18 & 4 & 7 \\ 2 & 12 & 3 \\ 0 & 4 & 10 \end{pmatrix}, \quad
   M^{(b)} = \begin{pmatrix} 12 & 6 & 6 \\ 3 & 9 & 3 \\ 2 & 5 & 11 \end{pmatrix}, \quad
   M^{(c)} = \begin{pmatrix} 14 & 2 & 2 \\ 4 & 11 & 5 \\ 2 & 7 & 13 \end{pmatrix} \]

If the classifiers a, b, and c assigned the unknown instance $x$ to location 1, location 2, and location 3, respectively, the belief values of the predictions can be calculated as follows:

\[ u(j_a) = \frac{18}{29} \cdot \frac{3}{15} \cdot \frac{2}{22} = \frac{108}{9570} \]

\[ u(j_b) = \frac{4}{29} \cdot \frac{9}{15} \cdot \frac{7}{22} = \frac{252}{9570} \]

\[ u(j_c) = \frac{7}{29} \cdot \frac{3}{15} \cdot \frac{13}{22} = \frac{273}{9570} \]

The multi-classifier assigns location 3 to the unknown instance $x$, because $j_c$, the prediction of the classifier c, has the highest belief value.
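The decision procedure and the worked example above can be reproduced with a short sketch. It is illustrative only and rests on one loudly stated assumption: the matrix indexing is chosen so that the numbers of the worked example come out, i.e. the row is the location output by the classifier and the column is the candidate location, with each factor normalized by the sum of that row. Function names are hypothetical and locations are 0-based (location 1 is index 0).

```python
from collections import Counter
from fractions import Fraction


def belief(matrices, predictions, candidate):
    """Belief value of `candidate` given each classifier's prediction.

    Assumption (matches the worked example): for classifier i with
    prediction p_i, the factor is M_i[p_i][candidate] / sum(M_i[p_i]).
    """
    value = Fraction(1)
    for matrix, predicted in zip(matrices, predictions):
        row = matrix[predicted]
        value *= Fraction(row[candidate], sum(row))
    return value


def combine(matrices, predictions):
    """Majority vote first; otherwise the highest belief value wins.

    Returns a list of winning locations. More than one entry means a tie,
    which the text resolves by taking the middle point of those locations
    (left to the caller here, since coordinates are not modelled).
    """
    location, count = Counter(predictions).most_common(1)[0]
    if count > len(predictions) / 2:
        return [location]
    beliefs = {c: belief(matrices, predictions, c) for c in set(predictions)}
    best = max(beliefs.values())
    return sorted(c for c, b in beliefs.items() if b == best)


# Confusion matrices of classifiers a, b and c from the example.
M_A = [[18, 4, 7], [2, 12, 3], [0, 4, 10]]
M_B = [[12, 6, 6], [3, 9, 3], [2, 5, 11]]
M_C = [[14, 2, 2], [4, 11, 5], [2, 7, 13]]

# Classifiers a, b, c predict locations 1, 2, 3 (indices 0, 1, 2).
predictions = [0, 1, 2]
for candidate in predictions:
    print(candidate + 1, belief([M_A, M_B, M_C], predictions, candidate))
print("selected:", [c + 1 for c in combine([M_A, M_B, M_C], predictions)])
```

Exact fractions are used so that the results can be compared directly with 108/9570, 252/9570 and 273/9570; the printed values appear in lowest terms. A row of zeros in a confusion matrix would raise a division error, which a fuller implementation would need to handle.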

12.4 Evaluation

12.4.1 Experimental Setup

The performance of WLAN-based positioning systems depends on the environment in which the evaluation is performed. For this reason, we evaluated the proposed multi-classifier in three different environments; Table 12.1 briefly summarizes the test environments. Testbed 1 is an office environment; the dimension of the corridor in the office is 3 × 60 m. The office is on the third floor of the faculty building at the KAIST-ICC in Daejeon, South Korea. In the corridor, we have collected 100 samples of fingerprints from 60 different locations. Each location is 1 m away from each other. The testbed 2 indicates another office

Table 12.1 Summary of testbeds

                                      Testbed 1   Testbed 2   Testbed 3
Type                                  Corridor    Corridor    Hall
Dimension (m)                         3 × 60      4 × 45      15 × 15
Number of RPs                         60          45          25
Distance between RPs (m)              1           1           3
Number of APs deployed                48          69          36
Avg. number of APs in one sample      16.6        16.8        13.9
Std.Dev of number of APs in sample    1.89        4.24        3.48

environment where the dimension of the corridor is 4 × 45 m. The office is located on the second floor of the Truth building at the KAIST-ICC. We have collected 100 samples of fingerprints from 45 different locations. Each location is 1 m away from each other. Testbed 3 is a large and empty space inside the building, located on the first floor of the Lecture building at the KAIST-ICC. The dimension of the space is 15 × 15 m. In testbed 3, we have collected 100 samples of fingerprints from 25 different locations. Each location is 3 m away from each other. Unlike testbeds 1 and 2, testbed 3 has no attenuation factors that may disturb signal propagation. To collect the data, we used the HTC-G1 mobile phone with the Android 1.6 platform and the API provided by the platform. We used half (50%) of the collected data as the learning data and the rest as the test data. To demonstrate the performance of the multi-classifier, we built it from three classifiers, k-NN (with k = 3) [2], Bayesian [9], and Histogram classifiers [10]; the performance of the multi-classifier was compared with these three classifiers, as shown in Table 12.2.
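The statistics used to compare the classifiers (average, standard deviation, maximum, minimum and 90th percentile of the error distance) can be computed as in the sketch below. The details are assumptions, since the text does not state them: the population standard deviation and the nearest-rank percentile are used, and the sample errors are hypothetical.

```python
import math
import statistics


def error_distance(estimated, actual):
    """Euclidean distance (m) between an estimated and a true position."""
    return math.dist(estimated, actual)


def percentile_nearest_rank(values, pct):
    """Nearest-rank percentile: smallest value covering pct % of the data."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(errors):
    """Summary statistics of a list of error distances (same unit as input)."""
    return {
        "avg": statistics.mean(errors),
        "std": statistics.pstdev(errors),  # population std. dev. (assumed)
        "max": max(errors),
        "min": min(errors),
        "p90": percentile_nearest_rank(errors, 90),
    }


# Hypothetical error distances in metres, for illustration only.
errors = [0, 0, 1, 1, 2, 3, 3, 4, 5, 9]
print(summarize(errors))
```

Other percentile definitions (for example, linear interpolation) give slightly different 90th-percentile values on small samples, so the definition should be fixed before comparing classifiers.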

12.4.2 Results

The results show that no single classifier outperformed the others in all three test environments. They indicate that the performance of WLAN fingerprint-based positioning systems is sensitive to the environment, and that the multi-classifier turned out to be much more effective in mitigating this characteristic of WLAN signals. Figure 12.3 reports the average error distance with respect to the number of APs. As Fig. 12.3a and b show, the performance of the classifiers differs considerably between test environments. Although testbeds 1 and 2 are similar indoor environments, the performance in testbed 1 is better than that in testbed 2. In particular, the average error distance of the k-NN classifier in testbed 1 was 1.21 m when 15 APs were used for positioning,

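Table 12.2 Performance summary of the classifiers (meter)

| Testbed | Classifier | Avg | Std.Dev | Max | Min | 90th Percentile |
|---|---|---|---|---|---|---|
| Testbed 1 | k-NN | 4.0 | 5.3 | 43 | 0 | 12.0 |
| Testbed 1 | Histogram | 2.6 | 3.8 | 29 | 0 | 7.0 |
| Testbed 1 | Bayesian | 2.8 | 3.9 | 25 | 0 | 7.0 |
| Testbed 1 | Multi | 2.4 | 3.6 | 25 | 0 | 7.0 |
| Testbed 2 | k-NN | 1.3 | 3.0 | 44 | 0 | 3.0 |
| Testbed 2 | Histogram | 2.0 | 2.5 | 26 | 0 | 5.0 |
| Testbed 2 | Bayesian | 1.3 | 1.8 | 17 | 0 | 3.0 |
| Testbed 2 | Multi | 1.1 | 1.6 | 13 | 0 | 3.0 |
| Testbed 3 | k-NN | 4.8 | 4.5 | 22.5 | 0 | 18.03 |
| Testbed 3 | Histogram | 5.8 | 4.6 | 22.5 | 0 | 20.62 |
| Testbed 3 | Bayesian | 5.6 | 5.1 | 22.5 | 0 | 21.21 |
| Testbed 3 | Multi | 4.5 | 4.5 | 22.5 | 0 | 18.03 |

Fig. 12.3 Average error distance versus number of APs used for positioning in (a) Testbed 1, (b) Testbed 2, and (c) Testbed 3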

while it was 4.6 m in testbed 2. For the histogram classifier, the average error distances with 15 APs were 1.9 and 2.7 m in testbeds 1 and 2, respectively. Under the same condition, the Naïve Bayesian classifier's average error distances in testbeds 1 and 2 were 1.25 and 2.47 m, respectively. Compared with the other classifiers, the multi-classifier showed improved results. In testbeds 1 and 2, the average error distances of the multi-classifier


with 15 APs were 1.1 and 2.3 m, respectively. In testbed 3, the accuracy of all classifiers is much poorer than in the other testbeds. Based on these findings, we believe that WLAN fingerprint-based positioning systems perform better in office environments than in hall environments, which contain few attenuation factors. As shown in Fig. 12.3, the multi-classifier can mitigate the environment-dependent characteristics of the single classifiers. From the results shown in Fig. 12.3, we conclude that the multi-classifier is effective in reducing the error distance in localization.

Table 12.2 summarizes the performance of the classifiers. The standard deviation of the errors of the multi-classifier in testbed 1 was 3.6 m, while the k-NN, Histogram, and Bayesian classifiers showed standard deviations of 5.8, 3.8, and 3.9 m, respectively. In testbed 2, the standard deviations of the error of all classifiers were lower than in testbed 1: those of k-NN, Histogram, and Bayesian were 3.0, 2.5, and 1.8 m, respectively. In testbed 3, the standard deviations of the error of the k-NN, Histogram, and Bayesian classifiers were 4.5, 4.6, and 5.1 m, respectively. These results confirm that the standard deviation of the errors of WLAN fingerprint-based positioning systems also depends on the environment. In terms of the standard deviation of the error, the proposed multi-classifier matched or outperformed the others in all testbeds: its standard deviations in testbeds 1, 2, and 3 were 3.6, 1.6, and 4.5 m, respectively. From these results, we confirmed that the multi-classifier can mitigate the environment-dependent characteristics of the single classifiers, and that its performance was better than that of the others in all environments. Even though the improvement was not remarkable, the results indicate that combining a number of classifiers is a promising approach to constructing reliable and accurate WLAN fingerprint-based positioning systems.
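The summary columns of Table 12.2 (average, standard deviation, maximum, minimum, and 90th percentile of the error distance) can be computed as in the following sketch. The source does not state whether the population or sample standard deviation was used, nor which percentile definition; the population form and linear interpolation between ranks are assumptions here, and the error values are made up for illustration.

```python
import math
import statistics

def percentile(values, q):
    """q-th percentile (0-100), linear interpolation between ranks (assumed)."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)

def summarize(errors):
    """Summary statistics of a list of error distances in meters."""
    return {
        "avg": statistics.mean(errors),
        "std": statistics.pstdev(errors),  # population form (assumed)
        "max": max(errors),
        "min": min(errors),
        "p90": percentile(errors, 90),
    }

# Hypothetical error distances (m) of one classifier in one testbed.
example_errors = [0.0, 0.0, 1.0, 1.0, 2.0, 2.0, 3.0, 3.0, 4.0, 14.0]
summary = summarize(example_errors)
print(summary)
```

A single large outlier, such as the 14 m error above, raises the maximum and the standard deviation far more than the average, which is why the table reports them separately.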

12.5 Summary and Future Work

In this paper, we have presented an environment-independent multi-classifier for WLAN fingerprint-based positioning systems, in an effort to mitigate undesirable environmental effects and factors. We have developed a method of combining multiple classifiers for the purpose of error correction: if a single classifier predicts wrongly, the other classifiers can correct it. In other words, the classifiers in the multi-classifier complement each other. We have evaluated the multi-classifier in three different environments with various environmental factors: the number of APs, the width of the corridor, the materials used, etc. The multi-classifier was constructed from three different classifiers: k-NN (with k = 3), Bayesian, and Histogram. As a result, the multi-classifier showed consistent performance across the diverse test environments, while the other classifiers showed inconsistent performance. The performance of


the multi-classifier tends to follow that of the best-performing single classifier. This means that the classifiers in the multi-classifier complement each other, and thus the errors are corrected more effectively. As a next step, we are going to investigate a more efficient combining rule. In this work, we have mixed the Bayesian combining rule and the majority vote; however, the performance enhancement was marginal. Considering the complexity overhead of using multiple classifiers, the multi-classifier may not be a cost-effective approach. Finding the best combination of classifiers will be another direction of our future work. We have tested only three classifiers, and two of them take similar approaches; the fingerprint is the only feature used for positioning. A number of systems consider other aspects of WLAN signals and use additional features. In the near future, we are going to implement and evaluate the multi-classifier with various types of classifiers.

Acknowledgments This research was supported by the MKE (The Ministry of Knowledge Economy), Korea, under the ITRC (Information Technology Research Center) support program supervised by the NIPA (National IT Industry Promotion Agency) (NIPA-2010-(C1090-10110013)), and by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MEST) (No. 2008-0061123).

References

1. Enge P, Misra P (1999) Special issue on global positioning system. Proc IEEE 87(1):3–15
2. Bahl P, Padmanabhan V (2000) RADAR: an in-building RF-based user location and tracking system. Proc IEEE Infocom 2:775–784
3. Drane C, Macnaughtan M, Scott C (1998) Positioning GSM telephones. IEEE Commun Mag 36(4):46–54
4. Priyantha N, Chakraborty A, Balakrishnan H (2000) The cricket location-support system. In: Proceedings of the 6th Annual International Conference on Mobile Computing and Networking, pp 32–43
5. Want R, Hopper A, Falcão V, Gibbons J (1992) The active badge location system. ACM Trans Inf Syst (TOIS) 10(1):102
6. Borenovic M, Neskovic A (2009) Comparative analysis of RSSI, SNR and noise level parameters applicability for WLAN positioning purposes. In: Proceedings of the IEEE EUROCON, pp 1895–1900
7. Yamasaki R, Ogino A, Tamaki T, Uta T, Matsuzawa N, Kato T (2005) TDOA location system for IEEE 802.11b WLAN. In: Proceedings of IEEE WCNC'05, pp 2338–2343
8. Kushki A, Plataniotis K, Venetsanopoulos A (2007) Kernel-based positioning in wireless local area networks. IEEE Trans Mobile Comput 6(6):689–705
9. Madigan D, Elnahrawy E, Martin R (2005) Bayesian indoor positioning systems. In: Proceedings of INFOCOM, pp 1217–1227
10. Roos T, Myllymaki P, Tirri H, Misikangas P, Sievanen J (2002) A probabilistic approach to WLAN user location estimation. Int J Wirel Inf Netw 9(3):155–164
11. Yeung W, Ng J (2007) Wireless LAN positioning based on received signal strength from mobile device and access points. In: IEEE International Conference on Embedded and Real-Time Computing Systems and Applications, pp 131–137


12. Youssef M, Agrawala A, Shankar A (2003) WLAN location determination via clustering and probability distributions. In: Proceedings of the First IEEE International Conference on Pervasive Computing and Communications, p 143
13. Borenovic M, Neskovic A, Budimir D (2009) Cascade-connected ANN structures for indoor WLAN positioning. Intell Data Eng Autom Learning-IDEAL 392–399
14. Chen Y, Yang Q, Yin J, Chai X (2006) Power-efficient access-point selection for indoor location estimation. IEEE Trans Knowl Data Eng 18(7):877–888
15. Yin J, Yang Q, Ni L (2008) Learning adaptive temporal radio maps for signal-strength-based location estimation. IEEE Trans Mobile Comput 7(7):869–883
16. Xu L, Krzyzak A, Suen C (1992) Methods of combining multiple classifiers and their application to hand writing recognition. IEEE Trans Syst Man Cybern 22:418–435
17. Kuncheva L (2001) Decision templates for multiple classifier fusion: an experimental comparison. Pattern Recogn 34(2):299–314
18. Castro P, Chiu P, Kremenek T, Muntz R (2001) A probabilistic room location service for wireless networked environments. In: Proceeding of the 3rd International Conference on Ubiquitous Computing, pp 18–34
19. Brunato M, Battiti R (2005) Statistical learning theory for location fingerprinting in wireless LANs. Comput Netw 47(6):825–845
20. Berna M, Lisien B, Sellner B, Gordon G, Pfenning F, Thrun S (2003) A learning algorithm for localizing people based on wireless signal strength that uses labeled and unlabeled data. In: Proceedings of IJCAI, pp 1427–1428
21. Moraes L, Nunes B (2006) Calibration-free WLAN location system based on dynamic mapping of signal strength. In: Proceedings of the 4th ACM International Workshop on Mobility Management and Wireless Access, pp 92–99
22. Chen Y, Yin J, Chai X, Yang Q (2006) Power efficient access-point selection for indoor location estimation. IEEE Trans Knowl Data Eng 1(18):878–888
23. Shin J, Han D (2010) Multi-classifier for WLAN fingerprint-based positioning system. Lecture notes in engineering and computer science: Proceedings of the World Congress on Engineering, WCE 2010, 30 June–2 July, London, UK, pp 768–773
24. Kittler J (1998) Combining classifiers: a theoretical framework. Pattern Anal Appl 1(1):18–27
25. Chen K, Wang L, Chi H (1997) Method of combining multiple classifiers with different features and their applications to text-independent speaker identification. Int J Pattern Recognit Artif Intell 11(3):417–445

Chapter 13

Intensity Constrained Flat Kernel Image Filtering, a Scheme for Dual Domain Local Processing

Alexander A. Gutenev

Abstract A non-linear image filtering scheme is described. The scheme is inspired by the dual domain bilateral filter but, owing to a much simpler pixel weighting arrangement, the computation of the result is much faster. The scheme relies on two principal assumptions: equal weight of all pixels within an isotropic kernel and a constraint imposed on the intensity of pixels within the kernel. The constraint is defined by the intensity of the central pixel under the kernel. Hence the name of the scheme: Intensity Constrained Flat Kernel (ICFK). Unlike the bilateral filter, designed solely for the purpose of edge preserving smoothing, the ICFK scheme produces a variety of filters depending on the underlying processing function. This flexibility is demonstrated by examples of an edge preserving noise suppression filter, a contrast enhancement filter and an adaptive image threshold operator. The latter classifies pixels depending on the local average. The versatility of the operators already discovered suggests further potential of the scheme.

13.1 Introduction

The initial stimulus for the development of the proposed scheme arose from the need for a noise suppressing, edge preserving smoothing filter with quasi real-time performance. The literature on edge preserving smoothing is plentiful. The most successful methods employ a dual domain approach: they define the operation result as a function of "distances" in two domains, spatial and intensity. The "distances" are measured from a reference pixel of the input image. Well known

A. A. Gutenev (&) Retiarius Pty Ltd., P.O. Box 1606, Warriewood, NSW 2102, Australia e-mail: [email protected]

S. I. Ao and L. Gelman (eds.), Electrical Engineering and Applied Computing, Lecture Notes in Electrical Engineering, 90, DOI: 10.1007/978-94-007-1192-1_13, © Springer Science+Business Media B.V. 2011


examples are SUSAN [1] or, in more general form, the bilateral filter [2]. The main design purpose of these filtering schemes was to adapt the level of smoothing to the amount of detail available within the neighborhood of the reference pixel. Applications of such schemes range from adaptive noise suppression to the creation of cartoon-like scenes from real world photographs [3]. The main weakness of the bilateral filter is its slow execution speed, due to the exponential weighting functions applied to the image pixels in both the spatial and intensity domains. There is a range of publications describing ways of improving the calculation speed of the bilateral filter [4–6]. In this paper we shall see that simplifying the weighting functions in both the spatial and intensity domains not only increases the speed of computation without losing the essence of edge preserving smoothing, but also suggests a filter generation scheme versatile enough to produce operators beyond the original task of adaptive smoothing.

13.2 Intensity Constrained Flat Kernel Filtering Scheme

13.2.1 Intensity Constrained Flat Kernel Filter as a Simplification of the Bilateral Filter

The bilateral filter is considered here in the light of its original purpose: single pass application. The output of the bilateral filter [2] is given by the formula [5]

$$I_p^{b} = \frac{1}{W_p^{b}} \sum_{q \in S} G_{\sigma_S}(\lVert p - q \rVert)\, G_{\sigma_R}(\lvert I_p - I_q \rvert)\, I_q, \tag{13.1}$$

where $p$ and $q$ are vectors describing the spatial positions of the pixels, $p, q \in S$, where $S$ is the spatial domain, the set of all possible pixel positions within the image; $I_p$ and $I_q$ are the intensities of the pixels at positions $p$ and $q$, $I_p, I_q \in R$, where $R$ is the range or intensity domain, the set of all possible intensities of the image; $G_{\sigma}(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{x^2}{2\sigma^2}\right)$ is the Gaussian weighting function, with separate weight parameters $\sigma_S$ and $\sigma_R$ for the spatial and intensity components; and $W_p^{b} = \sum_{q \in S} G_{\sigma_S}(\lVert p - q \rVert)\, G_{\sigma_R}(\lvert I_p - I_q \rvert)$ is the normalization coefficient.

Formula (13.1) states that the resulting intensity $I_p^{b}$ of the pixel at position $p$ is calculated as a weighted sum of the intensities of all other pixels in the image, with the weights decreasing exponentially as the distance between the pixel at variable position $q$ and the reference pixel at position $p$ increases. The contributing distances are measured in both the spatial and range domains. Owing to the digital nature of the signal, function (13.1) has a finite support and its calculation is truncated to the neighborhoods of the pixel at position $p$ and intensity $I_p$. The size of the neighborhood is defined by the parameters $\sigma_S$ and $\sigma_R$ and the sampling rates in both spatial


Fig. 13.1 Components making the bilateral filter

and intensity domains. The computation scheme proposed below truncates (13.1) further by giving all pixels in the selected neighborhood the same spatial weight. Furthermore, the intensity weighting part of (13.1), applied to the histogram of the neighborhood, is reduced to a range constraint around the intensity $I_p$ of the reference pixel. The idea is illustrated by Figs. 13.1 and 13.2. For simplicity, a one-dimensional signal is presented on the graphs. The components that make up the output of the bilateral filter (Fig. 13.1) at a particular spatial position $p$ are:
(i) the part of the signal under the kernel centered at the pixel at $p$;
(ii) the Gaussian spatial weighting function with its maximum at the pixel at $p$ and "width" parameter $\sigma_S$;
(iii) the histogram of the pixels under the kernel centered at the pixel at $p$;
(iv) the Gaussian intensity weighting function with its maximum at $I_p$ and "width" parameter $\sigma_R$.
The proposed filtering scheme (Fig. 13.2) replaces components (ii) and (iv), the Gaussians, with simple windowing functions. The flat


Fig. 13.2 Components making the Intensity constrained flat kernel filtering scheme

kernel works as a spatial filter, selecting spatial information in the neighborhood of the reference pixel at $p$. This information, in the form of a histogram, is passed to the intensity filter, which limits the processed information to that in the intensity neighborhood of the reference pixel intensity $I_p$. This is where the commonality between the bilateral and ICFK filtering schemes ends. For the ICFK scheme, the result of the operation depends on the processing function applied to the spatially pre-selected data:

$$I_p^{ICFK} = \begin{cases} F\left(H_{v_p}\big|_{K(p)}\right), & \text{if } H_{v_p}(I_p) \neq 1 \\ G\left(H_{v_p}\right), & \text{if } H_{v_p}(I_p) = 1, \end{cases} \tag{13.2}$$


where $H_{v_p}$ is the histogram of the part of the image masked by the kernel $v$ centered at $p$, $H_{v_p}(I_p)$ is the pixel count of the histogram at the level $I_p$, and $H_{v_p}|_{K(p)}$ is the part of the histogram $H_{v_p}$ subject to the constraint $K(p)$. The second function $G$, applied only when the intensity level $I_p$ is unique within the region masked by the kernel, emphasizes the need for special treatment of potential outliers. Indeed, if the intensity level of a pixel is unique within a sizeable neighborhood, the pixel most likely belongs to noise and should be treated as such. As will be shown below, the selection of the functions $F$ and $G$, as well as the constraint $K$, defines the nature of the resulting filter, which includes but is not limited to adaptive smoothing.

The output of the filter (13.2) also depends on the shape of the kernel $v$. In digital image processing, the kernel shape is often selected for the speed with which filter results can be calculated as the kernel scans across the image. In the case of ICFK filters, this translates into the speed of histogram updates during the scan. There is a significant number of publications [7–9] on methods of speeding up histogram updates as a square kernel scans the image. In order to avoid shape distortion of the filter output, it is more appropriate to use an isotropic kernel, a digital approximation of a circle. A method to speed up the histogram updates while scanning with an isotropic kernel is described in [10]. It is based on the idea proposed in [11]. In the analysis and examples below an isotropic kernel is used. Such a kernel is fully defined by its radius $r$.

A few words have to be said about the choice of the constraint $K(p)$. In the bilateral filter this role is played by the exponent. Separating the constraint function from the processing functions $F$ and $G$ adds an extra degree of freedom to the filtering scheme. One possible definition of $K(p)$ is offered in Fig. 13.2, where the exponent is replaced by a window function with a fixed window size, $K(p) = I_p \pm d$, where $d$ is a fixed number that depends on the dynamic range of the source image. For example, for integral image types it is an integer. In some cases, when looking for dark features on a bright background, one may want to apply stronger smoothing to the brighter part of the image and reduce the smoothing as the intensity decreases. Then the constraint can take the form

$$K(p) = I_p \pm I_p\, c, \tag{13.3}$$

where $c$ is a fixed ratio. Furthermore, one can make the constraint adaptive and, for example, shrink the domain of the function $F$ as the variance within the area masked by the kernel increases:

$$K(p) = I_p \pm \left[d_{max} - a\,(d_{max} - d_{min})\right],$$


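where $d_{max}$ and $d_{min}$ are fixed maximum and minimum values for the intensity range,

$$a = \frac{\operatorname{var}(H_{v_p}) - \min_{q \in S} \operatorname{var}(H_{v_q})}{\max_{q \in S} \operatorname{var}(H_{v_q}) - \min_{q \in S} \operatorname{var}(H_{v_q})}, \qquad \max_{q \in S} \operatorname{var}(H_{v_q}) - \min_{q \in S} \operatorname{var}(H_{v_q}) \neq 0,$$

and $\operatorname{var}(H_{v_p})$ is the variance of the area under the kernel centered at $p$.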
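The three forms of the constraint K(p) discussed above can be written down as small helper functions, each returning the lower and upper bound of the admitted intensity range. This is only a sketch: the function names and the choice to return a pair of bounds are assumptions, not part of the described scheme.

```python
def fixed_window(i_p, d):
    """K(p) = I_p ± d: a window of fixed half-width d."""
    return i_p - d, i_p + d

def ratio_window(i_p, c):
    """K(p) = I_p ± I_p * c, as in (13.3): half-width proportional to I_p."""
    return i_p - i_p * c, i_p + i_p * c

def adaptive_window(i_p, var_p, var_min, var_max, d_min, d_max):
    """K(p) = I_p ± [d_max - a * (d_max - d_min)], where a is the local
    variance normalized to [0, 1] over the whole image."""
    if var_max == var_min:
        raise ValueError("the variance range over the image must be non-zero")
    a = (var_p - var_min) / (var_max - var_min)
    half_width = d_max - a * (d_max - d_min)
    return i_p - half_width, i_p + half_width

# The adaptive window narrows from d_max to d_min as the local variance grows.
print(adaptive_window(100, var_p=0.0, var_min=0.0, var_max=10.0, d_min=2.0, d_max=10.0))
print(adaptive_window(100, var_p=10.0, var_min=0.0, var_max=10.0, d_min=2.0, d_max=10.0))
```

With the ratio form, brighter pixels get a wider window and hence stronger smoothing, which matches the stated motivation of looking for dark features on a bright background.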

13.3 Operators Derived from Intensity Constrained Flat Kernel Filtering Scheme

13.3.1 Edge Preserving Smoothing Filter

This filter can be considered a mapping of the bilateral filter into the ICFK filtering scheme. The functions $F$ and $G$ are given by the following formulae:

$$F = \overline{H_{v_p}\big|_{K(p)}}$$

is the average intensity within that part of the histogram under the kernel mask which satisfies the constraint $K(p)$, and

$$G = \operatorname{median}\left(H_{v_p}\right) \tag{13.4}$$

is the median of the area under the kernel mask. The median acts as a spurious noise suppression filter. From a computational point of view, the update of the histogram as the kernel slides across the image is the slowest operation. It was shown in [10] that the updates of the histogram and of the value of the median for an isotropic kernel can be performed efficiently and require $O(r)$ operations, where $r$ is the radius of the kernel.

The edge preserving properties of the filter emanate from the adaptive nature of the function $F$. The histogram $H_{v_p}$ is a statistic calculated within the mask of the neighborhood $v$ of the pixel at $p$ and comprises the intensities of all pixels within that neighborhood. However, the averaging is applied only to the intensities that lie in a smaller intensity neighborhood of $I_p$ constrained by $K(p)$. Thus the output value $I_p^{ICFK}$ is similar in intensity to $I_p$, and intensity-similar features from the spatial neighborhood are preserved in the filter output. If the level $I_p$ is unique in the neighborhood, it is considered noise and is replaced by the neighborhood median. An example of the application of the filter is given in Fig. 13.3. The condition (13.3) was used as the constraint. The filter is effective against small particle noise, such as noise produced by camera gain, where linear or median filters would not only blur the edges but would also create perceptually unacceptable noise lumps.


Fig. 13.3 Fragment of an underwater image, 733 × 740 pixels, with a large number of suspended particles, and the result of application of the edge preserving smoothing filter with the radius r = 12, subject to intensity constraint $K(p) = I_p \pm I_p \cdot 0.09$

Similarly to the bilateral filter, application of the proposed filter gives the areas with small contrast variation a cartoon-like appearance. Flat kernels have been a first choice for image smoothing since the inception of image processing. Other filtering schemes also place some constraints on the pixels within the kernel mask. A good example is the sigma filter [12] and its derivatives [13]. The fundamental difference between the sigma filter and the proposed filter is in the treatment of the pixels within the mask. The sigma filter applies its filtering action, the mean operator, to all the pixels within the mask if the central pixel is within a certain tolerance, the σ range, of the mean of the area under the mask; otherwise the filtering action is not applied and the pixel's input value is passed directly to the output. In the proposed filter, Hamlet's question, "to filter or not to filter", is never posed. The filtering action is always applied, but only to the subset of pixels that falls within a certain intensity range of the central pixel. Moreover, the filter output depends only on that reduced range, not on the whole region under the kernel mask as in the sigma filter.
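A naive one-dimensional sketch of the edge preserving smoothing filter may help to fix the idea. It uses a flat window of radius r, the ratio constraint (13.3), the constrained mean for F, and the window median for G. It recomputes each window from scratch instead of using the O(r) histogram updates of [10]; the function name, the truncation of the window at the signal borders, and the test signal are assumptions made for illustration.

```python
import statistics

def icfk_smooth(signal, r, c):
    """Naive 1-D ICFK edge preserving smoothing (illustrative sketch)."""
    out = []
    n = len(signal)
    for p, i_p in enumerate(signal):
        # Flat kernel: every sample in the window has the same weight.
        window = signal[max(0, p - r):min(n, p + r + 1)]
        if window.count(i_p) == 1:
            # I_p is unique under the kernel: treat it as noise, G = median.
            out.append(statistics.median(window))
        else:
            # Constraint K(p) = I_p ± I_p * c, then F = constrained mean.
            low, high = i_p - i_p * c, i_p + i_p * c
            selected = [v for v in window if low <= v <= high]
            out.append(sum(selected) / len(selected))
    return out

# Two plateaus (about 10 and about 50) with a single spike of 90.
noisy = [10, 10, 12, 10, 90, 10, 12, 10, 50, 50, 52, 50, 50]
smoothed = icfk_smooth(noisy, r=2, c=0.25)
print(smoothed)
```

In this toy signal the spike is replaced by the window median, while samples next to the step are averaged only with samples from their own plateau, so the edge is not blurred.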

13.3.2 Contrast Enhancement Filter for Low Noise Images

The expression (13.2) is general enough to describe not only "smoothing" filters, but "sharpening" ones as well. Consider the following expression for the operator function $F$:

$$F = \begin{cases} \min\left(H_{v_p}\big|_{K(p)}\right), & \text{if } I_p \le \overline{H_{v_p}} \\ \max\left(H_{v_p}\big|_{K(p)}\right), & \text{if } I_p > \overline{H_{v_p}}, \end{cases} \tag{13.5}$$

where $\overline{H_{v_p}}$ is the average intensity of the area under the kernel $v$ at $p$.


Fig. 13.4 An example of a dermatoscopic image, 577 × 434 pixels, of a skin lesion

For the purpose of noise suppression, the function (13.4) is the recommended choice for $G$ in (13.2). The function $F$ pushes the intensity of the output to one of the boundaries defined by the constraint, depending on the relative position of the reference intensity $I_p$ and the average intensity under the kernel. Like any other sharpening operator, the operator (13.5) amplifies the noise in the image. Hence it is most effective on low noise images. Dermatoscopic images of skin lesions are a good example of this class of images. Dermatoscopy, or epiluminescence microscopy, is a technique for imaging skin lesions using oil immersion. The latter is employed in order to remove specular light reflection from the skin surface. This technique has a proven diagnostic advantage over clinical photography, where images are taken without reflection suppressing oil immersion [14, 15]. Normally the technique uses controlled lighting conditions. With a proper balance of light intensity and camera gain, images taken with digital cameras have a very low level of electronic noise, while the specular reflection noise is removed by the immersion. An example of such an image is given in Fig. 13.4. Some lesions can have a very low inter-feature contrast. Thus both image processing techniques and visual inspection can benefit from contrast enhancement. The images in Figs. 13.5 and 13.6 show the application of the filter (13.5) and clearly indicate that the constraint parameter $c$ in (13.3) gives a significant level of control over the degree of the enhancement. Another property of this filter is worth emphasizing: due to its intrinsic nonlinearity, the filter does not produce any ringing at the edges it enhances. The proposed filter is, in spirit, not unlike the toggle contrast filter [16]. The difference lies in the degree of contrast enhancement, which in the case of the proposed filter has an additional control, the intensity constraint $K(p)$. This control allows making the contrast change as strong as that of the toggle contrast filter or as subtle as no contrast change at all.
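The contrast enhancement operator can be sketched in one dimension in the same naive way: the output is pushed to the lower or upper end of the constrained part of the window, depending on whether the reference intensity lies below or above the average under the whole kernel. The function name, the border handling, and the test signal are assumptions, and so is the treatment of the case where the reference intensity equals the average, which is assigned here to the minimum branch.

```python
import statistics

def icfk_enhance(signal, r, c):
    """Naive 1-D ICFK contrast enhancement (illustrative sketch)."""
    out = []
    n = len(signal)
    for p, i_p in enumerate(signal):
        window = signal[max(0, p - r):min(n, p + r + 1)]
        if window.count(i_p) == 1:
            # Unique level under the kernel: G = median, as in (13.4).
            out.append(statistics.median(window))
            continue
        # Constraint K(p) = I_p ± I_p * c.
        low, high = i_p - i_p * c, i_p + i_p * c
        selected = [v for v in window if low <= v <= high]
        mean = sum(window) / len(window)  # average under the whole kernel
        # Equality goes to the minimum branch (an assumption).
        out.append(min(selected) if i_p <= mean else max(selected))
    return out

soft_edge = [100, 100, 100, 104, 104, 104, 110, 110, 110]
unchanged = icfk_enhance(soft_edge, r=2, c=0.0)
enhanced = icfk_enhance(soft_edge, r=2, c=0.05)
print(unchanged)
print(enhanced)
```

With c = 0 only samples equal to the reference intensity survive the constraint, so the signal passes through unchanged; a larger c lets the output move further from the input, but never by more than c times the reference intensity.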


Fig. 13.5 The dermatoscopic image after application of the contrast enhancement filter with the radius r = 7, subject to intensity constraint $K(p) = I_p \pm I_p \cdot 0.03$

Fig. 13.6 The dermatoscopic image after application of the contrast enhancement filter with the radius r = 7, subject to intensity constraint $K(p) = I_p \pm I_p \cdot 0.1$

13.3.3 Local Adaptive Threshold

Sharpening can be considered a dual operation to smoothing, so a processing scheme that produces a smoothing filter is naturally expected to produce a sharpening one. The following example shows the versatility of the ICFK scheme and its ability to produce somewhat unexpected operators that still fall within the definition (13.2). Consider a local threshold operator defined by the functions

$$F = G = \begin{cases} 1, & \text{if } \overline{H_{v_p}} \in H_{v_p}\big|_{K(p)} \\ 0, & \text{if } \overline{H_{v_p}} \notin H_{v_p}\big|_{K(p)}, \end{cases} \tag{13.6}$$

where $\overline{H_{v_p}}$ is the average intensity of the area under the kernel $v$ at $p$. The operator (13.6) produces a binary image, attributing to the background the pixels at which the local average for the whole area under the kernel $v$ at $p$ is outside the constrained part of the histogram. The detector (13.6) can be useful in


Fig. 13.7 Dermatoscopic image, 398 × 339 pixels, of a skin lesion with hair, and overlay of direct application of the local adaptive threshold with kernel of radius r = 5 and intensity constraint $K(p) = I_p \pm I_p \cdot 0.2$

Fig. 13.8 Overlay of application of the local adaptive threshold with kernel of radius r = 5 and intensity constraint $K(p) = I_p \pm I_p \cdot 0.2$ followed by morphological cleaning

identifying narrow linear features in images. For example, one of the problems in the automatic diagnosis of skin lesions using dermoscopy is the removal of artifacts such as hairs and oil bubbles trapped in the immersion fluid. The detector (13.6) can identify both of those features, as they stand out from the local background. The left half of Fig. 13.7 shows the image with hair and some bubbles. In automated lesion diagnosis systems, hair and bubbles are undesirable artifacts which need to be detected as non-diagnostic features. Prior to application of the operator (13.6), the source image needs to be preprocessed in order to remove the ringing around the hairs caused by sharpening in the video capture device. The preprocessing consists of applying the edge preserving smoothing filter (13.2) with the kernel radius r = 3 and the intensity constraint (13.3) with c = 0.08. Direct application of filter


(13.6) to the preprocessed image gives the combined hair and bubble mask, which is presented as an overlay on the right of Fig. 13.7. Application of the same filter followed by post-cleaning, which utilizes some morphological operations, is presented in Fig. 13.8. The advantage of this threshold technique is its adaptation to the local intensity defined by the size of the processing kernel. All ICFK filters described above are implemented and available as part of the Pictorial Image Processor© package at www.pic-i-proc.com. A significant part of this work was first presented in [17].

Acknowledgments The author thanks Dr. Scott Menzies from Sydney Melanoma Diagnostic Centre and Michelle Avramidis from the Skintography Clinic for kindly providing dermatoscopic images. The author is also grateful to Prof. H. Talbot for pointing out some similarities between the proposed filters and existing filters.

References

1. Smith SM, Brady JM (1997) SUSAN: a new approach to low level image processing. Int J Comput Vis 23(1):45–78
2. Tomasi C, Manduchi R (1998) Bilateral filtering for gray and color images. In: Proceedings of the 1998 IEEE International Conference on Computer Vision. Bombay, India, pp 839–846
3. Kang H, Lee S, Chui CK (2009) Flow based image abstraction. IEEE Trans Vis Comput Graph 16(1):62–76
4. Durand F, Dorsey J (2002) Fast bilateral filtering for the display of high-dynamic-range images. ACM Trans Graph 21(3):257–266
5. Paris S, Durand F (2009) A fast approximation of the bilateral filter using a signal processing approach. Int J Comput Vis 81(1):24–52
6. Elad M (2002) On the bilateral filter and ways to improve it. IEEE Trans Image Process 11(10):1141–1151
7. Gil J, Werman M (1993) Computing 2-D min, median and max. IEEE Trans Pattern Anal Mach Intell 15:504–507
8. Weiss B (2006) Fast median and bilateral filtering. ACM Trans Graph (TOG) 25(3):519–526
9. Perreault S, Hebert P (2007) Median filtering in constant time. IEEE Trans Image Process 16(9):2389–2394
10. Gutenev A, From isotropic filtering to intensity constrained flat kernel filtering scheme. IEEE Trans Image Process (submitted for publication)
11. van Droogenbroeck M, Talbot H (1996) Fast computation of morphological operations with arbitrary structural element. Patt Recog Lett 17:1451–1460
12. Lee JS (1983) Digital image smoothing and the sigma filter. Comp Vis Graph Image Proc 24(2):255–269
13. Lukac R et al (2003) Angular multichannel sigma filter. In: Proceedings (ICASSP '03) IEEE international conference on acoustics, speech, and signal processing, vol 3, pp 745–748
14. Pehamberger H, Binder M, Steiner A, Wolff K (1993) In vivo epiluminescence microscopy: improvement of early diagnosis of melanoma. J Invest Dermatol 100:356S–362S
15. Menzies SW, Ingvar C, McCarthy WH (1996) A sensitivity and specificity analysis of the surface microscopy features of invasive melanoma. Melanoma Res 6:55–62
16. Kramer HP, Bruckner JB (1975) Iterations of a nonlinear transformation for enhancement of digital images. Pattern Recogn 7:53–58
17. Gutenev AA (2010) Intensity constrained flat kernel image filtering scheme: definition and applications. Lecture notes in engineering and computer science: Proceedings of the World Congress on Engineering, WCE 2010, vol I, 30 June–2 July, London, UK, pp 641–645

Chapter 14

Convolutive Blind Separation of Speech Mixtures Using Auditory-Based Subband Model Sid-Ahmed Selouani, Yasmina Benabderrahmane, Abderraouf Ben Salem, Habib Hamam and Douglas O’Shaughnessy

Abstract A new blind speech separation (BSS) method for convolutive mixtures is presented. This method uses a sample-by-sample algorithm to perform the subband decomposition by mimicking the processing performed by the human ear. The unknown source signals are separated by maximizing the entropy of a transformed set of signal mixtures through the use of a gradient ascent algorithm. Experimental results show the efficiency of the proposed approach in terms of signal-to-interference ratio (SIR) and perceptual evaluation of speech quality (PESQ) criteria. Compared to the fullband method that uses the Infomax algorithm and to the convolutive fast independent component analysis (C-FICA), our method achieves a better PESQ score and shows a substantial improvement in SIR for different locations of sensor inputs.

S.-A. Selouani (&) Université de Moncton, Shippagan campus, Shippagan, NB E8S 1P6, Canada e-mail: [email protected] Y. Benabderrahmane D. O’Shaughnessy INRS-EMT, Université du Québec, Montreal, H5A 1K6, Canada e-mail: [email protected] D. O’Shaughnessy e-mail: [email protected] A. B. Salem H. Hamam Université de Moncton, Moncton, E1A 3E9, Canada e-mail: [email protected] H. Hamam e-mail: [email protected]

S. I. Ao and L. Gelman (eds.), Electrical Engineering and Applied Computing, Lecture Notes in Electrical Engineering, 90, DOI: 10.1007/978-94-007-1192-1_14, Ó Springer Science+Business Media B.V. 2011


14.1 Introduction The practical goal of blind source separation (BSS) techniques is to extract the original source signals from their mixtures, and possibly to estimate the unknown mixing channel, using only the information from the observed signals with no, or very limited, knowledge about the source signals and the mixing channel. For several years, the separation of sources has been a particularly active research topic [19]. This interest can be explained by the wide spectrum of possible applications, which includes telecommunications, acoustics, seismology, location and tracking of radar and sonar targets, separation of speakers (the so-called 'cocktail party problem'), detection and separation in multiple-access communication systems, etc. Methods to solve the BSS problem can be divided into methods using second-order [5] or higher-order statistics [8], the maximum likelihood principle [3], principal component analysis (PCA) and non-linear PCA [13], and independent component analysis (ICA) methods [11, 14, 18]. Another important category of methods is subband BSS. Subband BSS has many advantages compared to the other frequency-domain BSS approaches regarding the well-known permutation ambiguity of frequency bins [1]. In fact, the permutation problem is much less critical in subband BSS, since the number of subbands that could be permuted is obviously smaller than the number of frequency bins. In addition, using a decimation process for each subband can considerably reduce the computational load when compared to time-domain approaches (which can be computationally demanding in the case of real-room mixtures). In [2], the subband analysis/synthesis system uses a polyphase filterbank with oversampling and single side band modulation. In low frequency bands, longer unmixing filters with overlap-blockshift are used. In [15], the subband analysis filterbank is implemented as a cosine-modulated prototype filter, which is designed as a truncated sinc() function weighted by a Hamming window. In [20], the impulse responses of the synthesis filters are based on the extended lapped transform and are defined by using the cosine modulation function. In the approach reported in [18], analysis filters are obtained by a generalized discrete Fourier transform. Analysis and synthesis filters are derived from a unique prototype filter, which can be designed by an iterative least-squares algorithm with a cost function that includes a stopband attenuation term. In the blind speech separation approach we propose, we separate mixed sources that are assumed to be statistically independent, without any a priori knowledge about the original source signals $s_j(n)$, $j \in \{1, \dots, N\}$, using only the observations $x_i(n)$, $i \in \{1, \dots, M\}$, obtained through $M$ sensors. Such signals are instantaneously or convolutively mixed. In this work, we are concerned with the convolutive case, i.e. the blind separation of convolved speech sources, where the source signals are filtered by impulse responses $h_{ij}(n)$ from source $j$ to sensor $i$. We are interested in the blind approach to separation, which offers the advantage of not relying on major assumptions about the mixture: besides its overall structure, often assumed linear, no parameters are assumed to be known. Mixtures in that case can be expressed in vector notation as

$$X(n) = \sum_{k=0}^{\infty} H(k)\, S(n-k), \qquad (14.1)$$

where $X(n) = [x_1(n), \dots, x_M(n)]^T$ is a vector of mixtures, $S(n) = [s_1(n), \dots, s_N(n)]^T$ is a vector of speech sources, and $H(k) = [h_{ij}(k)]$, $(i,j) \in \{1, \dots, M\} \times \{1, \dots, N\}$, is a matrix of FIR filters. To blindly estimate the sources, an unmixing process is carried out on the mixtures, and the estimated sources $Y(n) = [y_1(n), \dots, y_N(n)]^T$ can be written as

$$Y(n) = \sum_{k=0}^{L-1} W(k)\, X(n-k), \qquad (14.2)$$

where $W(k) = [w_{ij}(k)]$, $(i,j) \in \{1, \dots, M\} \times \{1, \dots, N\}$, is the unmixing matrix linking the $j$th output $y_j(n)$ with the $i$th mixture $x_i(n)$. This matrix is composed of FIR filters of length $L$. Each element is defined by the vector $w_{ij} = [w_{ij}(0), \dots, w_{ij}(L-1)]$, $\forall (i,j) \in \{1, \dots, M\} \times \{1, \dots, N\}$. To mitigate problems in both the time and frequency domains, the next sections present and evaluate a new framework for the BSS of convolutive mixtures, based on subband decomposition using an ear-model-based filterbank and an information maximization algorithm. This chapter extends our previous work [6]. A new evaluation criterion, namely the PESQ, is introduced, and a new set of experiments is carried out involving the well-known C-FICA method in different mixing conditions.
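As a rough illustration of Eqs. 14.1 and 14.2, the following Python sketch applies a matrix of FIR filters to a set of signals; the same routine can play the role of the mixing stage or of the unmixing stage. The function name `fir_matrix_filter`, the list layout `filters[i][j]` (taps from input j to output i, which is an indexing choice made for this sketch) and the toy coefficients are illustrative assumptions, not part of the original method.

```python
from typing import List, Sequence

Signal = List[float]


def fir_matrix_filter(filters: Sequence[Sequence[Sequence[float]]],
                      inputs: Sequence[Signal]) -> List[Signal]:
    """Compute out_i(n) = sum_j sum_k filters[i][j][k] * inputs[j][n - k].

    filters[i][j] holds the FIR taps linking input j to output i
    (an assumed layout chosen for this sketch).
    """
    n_samples = len(inputs[0])
    outputs: List[Signal] = []
    for row in filters:
        out = [0.0] * n_samples
        for j, taps in enumerate(row):
            for n in range(n_samples):
                for k, tap in enumerate(taps):
                    if n - k >= 0:  # samples before n = 0 are taken as zero
                        out[n] += tap * inputs[j][n - k]
        outputs.append(out)
    return outputs


# Toy two-source, two-sensor example with made-up impulse responses.
sources = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
H = [[[1.0, 0.5], [0.0]],
     [[0.0], [1.0]]]
mixtures = fir_matrix_filter(H, sources)
```

Calling the same function with a matrix of unmixing filters and `mixtures` as input would give the estimated sources of Eq. 14.2; finding those filters blindly is the actual separation problem.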

14.2 Proposed Method In this section, we introduce the convolutive mixture based on the head related transfer function (HRTF) that we used to evaluate the proposed method. Then we define the subband decomposition using the modeling of the mid-external ear and the basilar membrane that aims at mimicking the human auditory system (HAS). Afterwards, the learning rule performing the sources’ separation is introduced.

14.2.1 HRTF Mixing Model The perception of the acoustic environment, or room effect, is a complex phenomenon linked mainly to the multiple reflections, attenuation, diffraction and scattering that the acoustic wave undergoes on the constituent elements of the physical environment as it propagates from the source to the ear. These phenomena can be modeled by filters representing the diffraction, scattering and reflection that a sound wave sustains during its travel between its source and the entrance of the listener's ear canal. These filters are commonly called head related transfer functions, or HRTFs [10]. The principle of measuring an HRTF is to place microphones in the ears and record the signals corresponding to different source positions. The HRTF is the transfer function between the source signals and the signals at the ears. The HRTF is then considered as a linear and time-invariant system. Each HRTF is represented by a causal and stable finite impulse response (FIR) filter. In our experiments, sources are convolved with impulse responses modeling the HRTF. We tested our overall framework with mixing filters measured at the ears of a dummy head. We selected impulse responses associated with source positions defined by various angle values in relation to the dummy head (see Fig. 14.4).

14.2.2 Subband Decomposition The proposed modeling of the HAS consists of three parts that simulate the behavior of the mid-external ear, the inner ear, and the hair cells and fibers. The external and middle ear are modeled using a bandpass filter that can be adjusted to the signal energy to take into account the various adaptive motions of the ossicles. The model of the inner ear simulates the behavior of the basilar membrane (BM), which acts substantially as a non-linear filter bank. Due to the variability of its stiffness, different places along the BM are sensitive to sounds with different spectral content. In particular, the BM is stiff and thin at the base, but less rigid and more sensitive to low-frequency signals at the apex. Each location along the BM has a characteristic frequency at which it vibrates maximally for a given input sound. This behavior is simulated in the model by a cascade filter bank. The number of filters in the bank depends on the sampling rate of the signals and on other parameters of the model, such as the overlapping factor of the filter bands or the quality factor of the resonant part of the filters. The final part of the model deals with the electromechanical transduction of the hair cells and afferent fibers and the encoding at the level of the synaptic endings [7, 21].

14.2.2.1 Mid-External Ear The mid-external ear is modeled using a bandpass filter. For a mixture input $x_i(k)$, the recurrence formula of this filter is given by

$$x'_i(k) = x_i(k) - x_i(k-1) + a_1 x'_i(k-1) - a_2 x'_i(k-2), \qquad (14.3)$$

where $x'_i(k)$ is the filtered output, $k = 1, \dots, K$ is the time index and $K$ is the number of samples in a given block. The coefficients $a_1$ and $a_2$ depend on the sampling frequency $F_s$, the central frequency of the filter and its Q-factor.

14.2.2.2 Mathematical Model of the Basilar Membrane After each frame is transformed by the mid-external filter, it is passed to the cochlear filter bank, whose frequency responses simulate those of the BM for an auditory stimulus in the outer ear. The formula of the model is as follows:

$$x''_i(k) = b_{1,i}\, x''_i(k-1) - b_{2,i}\, x''_i(k-2) + G_i \left[ x'_i(k) - x'_i(k-2) \right], \qquad (14.4)$$

and its transfer function can be written as:

$$H_i(z) = \frac{G_i \left( 1 - z^{-2} \right)}{1 - b_{1,i}\, z^{-1} + b_{2,i}\, z^{-2}}, \qquad (14.5)$$

where $x''_i(k)$ is the BM displacement, which represents the vibration magnitude at position $d_i$ and constitutes the BM response to a mid-external sound stimulus $x'_i(k)$. The parameters $G_i$, $b_{1,i}$ and $b_{2,i}$, respectively the gain and the coefficients of filter (or channel) $i$, are functions of the position $d_i$ along the BM. $N_c$ cochlear filters are used to realize the model. These filters are characterized by the overlapping of their bands and a large bandwidth. The BM is taken to have a length of 35 mm, which is approximately the case for humans [7]. Thus, each channel represents the state of approximately $\Delta = 1.46$ mm of the BM. The sample-by-sample algorithm providing the outputs of the BM filters is given as follows.
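A minimal Python sketch of how the recurrences of Eqs. 14.3 and 14.4 could be evaluated sample by sample is shown here. It is not the authors' listing: the function names, the zero initial conditions, the coefficient values and the choice to drive every channel directly with the mid-external ear output (as Eq. 14.4 is written) are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def mid_external_ear(x: Sequence[float], a1: float, a2: float) -> List[float]:
    """Eq. 14.3: x'(k) = x(k) - x(k-1) + a1*x'(k-1) - a2*x'(k-2)."""
    out: List[float] = []
    x_prev = 0.0      # x(k-1), zero initial state assumed
    y1 = y2 = 0.0     # x'(k-1) and x'(k-2)
    for sample in x:
        y = sample - x_prev + a1 * y1 - a2 * y2
        out.append(y)
        x_prev = sample
        y2, y1 = y1, y
    return out


def basilar_channel(x_prime: Sequence[float], gain: float,
                    b1: float, b2: float) -> List[float]:
    """Eq. 14.4: x''(k) = b1*x''(k-1) - b2*x''(k-2) + G*[x'(k) - x'(k-2)]."""
    out: List[float] = []
    x1 = x2 = 0.0     # x'(k-1) and x'(k-2)
    y1 = y2 = 0.0     # x''(k-1) and x''(k-2)
    for sample in x_prime:
        y = b1 * y1 - b2 * y2 + gain * (sample - x2)
        out.append(y)
        x2, x1 = x1, sample
        y2, y1 = y1, y
    return out


def cochlear_filterbank(x: Sequence[float], a1: float, a2: float,
                        channels: Sequence[Tuple[float, float, float]]
                        ) -> List[List[float]]:
    """Filter one block with the ear filter, then with every channel.

    channels holds one hypothetical (gain, b1, b2) triple per cochlear filter.
    """
    x_prime = mid_external_ear(x, a1, a2)
    return [basilar_channel(x_prime, g, b1, b2) for (g, b1, b2) in channels]
```

In practice the coefficients would be derived from the sampling frequency, the centre frequencies and the Q-factors, as described above; the sketch only shows the per-sample update structure.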


14.2.3 Learning Algorithm After performing the subband decomposition, the separation of convolved sources in each subband is done by the Infomax algorithm. Infomax was developed by Bell and Sejnowski for the separation of instantaneous mixtures [4]. Its principle consists of maximizing the output entropy or minimizing the mutual information between the components of $Y$ [23]. It is implemented by maximizing, with respect to $W$, the entropy of $Z = \Phi(Y) = \Phi(WX)$. Thus, the Infomax contrast function is defined as

$$C(W) = H(\Phi(WX)), \qquad (14.6)$$

where $H(\cdot)$ is the differential entropy, which can be expressed as $H(a) = -E[\ln(f_a(a))]$, where $f_a(a)$ denotes the probability density function of a variable $a$. The generalization of Infomax to the convolutive case is performed by using a feedforward architecture. Both causal and non-causal FIR filters are used in our experiments. With real-valued data for the vector $X$, the entropy maximization algorithm leads to the adaptation of the unmixing filter coefficients with a stochastic gradient ascent rule using a learning step $\mu$. The weights are then updated as follows:

$$W(0) = W(0) + \mu \left( [W(0)]^{-T} - \Phi(Y(n))\, X^T(n) \right), \qquad (14.7)$$

and, $\forall k \neq 0$,

$$w_{ij}(k) = w_{ij}(k) - \mu\, \Phi(y_i(n))\, x_j(n-k), \qquad (14.8)$$

where $W(0)$ is a matrix composed of unmixing FIR filter coefficients as defined in Sect. 14.1, and $Y(n)$ and $X(n)$ are the separated sources and the observed mixtures, respectively. $\Phi(\cdot)$ is the score function of $y_i$, a non-linear function approximating the cumulative density function of the sources, as defined in Eq. 14.9, where $p(y_i)$ denotes the probability density function of $y_i$:

$$\Phi(y_i(n)) = -\frac{\dfrac{d\,p(y_i(n))}{d\,y_i(n)}}{p(y_i(n))}. \qquad (14.9)$$
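The update rules of Eqs. 14.7 and 14.8 can be sketched for a two-mixture, two-output case as follows. This is only an illustration under stated assumptions: the score function is taken as tanh(y), which corresponds to a source density proportional to 1/cosh(y) and is a common choice for speech-like signals rather than something specified in this chapter; the names `infomax_step`, `score` and `inverse_transpose_2x2`, and the layout `w[k][i][j]` (tap k from mixture j to output i), are hypothetical.

```python
import math
from typing import List

Matrix = List[List[float]]


def score(y: float) -> float:
    """Assumed score function: for p(y) proportional to 1/cosh(y),
    -p'(y)/p(y) equals tanh(y)."""
    return math.tanh(y)


def inverse_transpose_2x2(w: Matrix) -> Matrix:
    """Return the transpose of the inverse of a 2x2 matrix."""
    (a, b), (c, d) = w
    det = a * d - b * c
    return [[d / det, -c / det], [-b / det, a / det]]


def infomax_step(w: List[Matrix], x_hist: List[List[float]],
                 mu: float) -> List[Matrix]:
    """One stochastic gradient-ascent step following Eqs. 14.7 and 14.8.

    w[k][i][j] : tap k of the filter from mixture j to output i
    x_hist[k]  : mixture vector X(n - k), for k = 0 .. L-1
    """
    taps = len(w)
    # Current outputs y_i(n) = sum_k sum_j w[k][i][j] * x_j(n - k)
    y = [sum(w[k][i][j] * x_hist[k][j]
             for k in range(taps) for j in range(2))
         for i in range(2)]
    phi = [score(v) for v in y]
    inv_t = inverse_transpose_2x2(w[0])
    new_w: List[Matrix] = []
    for k in range(taps):
        new_k: Matrix = []
        for i in range(2):
            row = []
            for j in range(2):
                grad = -phi[i] * x_hist[k][j]
                if k == 0:
                    grad += inv_t[i][j]  # [W(0)]^{-T} term of Eq. 14.7
                row.append(w[k][i][j] + mu * grad)
            new_k.append(row)
        new_w.append(new_k)
    return new_w
```

In a subband system, one such update would be run per subband on the decimated mixtures, with the step size `mu` tuned experimentally.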

The block diagram of the proposed method is given in Fig. 14.1. The input signals, i.e. the set of mixtures, are first processed by the mid-external ear filter of Eq. 14.3. The outputs are then passed through a filterbank representing the cochlear part of the ear. A decimation process is then performed on each subband output. Such decimation is useful for two reasons. First, it improves the convergence speed, because the input signals are more whitened than in the time-domain approach. Second, the required unmixing filter length is reduced by a factor of $1/M$, where $M$ is the decimation factor. After performing decimation, we group the set of mixtures belonging to the same cochlear filter to form the input of the unmixing stage. The latter gives the separated sources of each subband, which are upsampled by a factor of $M$. The same filter bank is used for the synthesis stage. The estimated sources are obtained by adding the outputs of the different synthesis stages.

Fig. 14.1 The ear-based framework for the subband BSS of convolutive mixtures of speech
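The decimation and upsampling steps around the unmixing stage can be pictured with the short Python sketch below. It assumes the simplest possible scheme (keeping every M-th sample, then inserting zeros and leaving the smoothing to the synthesis filter); the chapter does not specify these details, and the function names are hypothetical.

```python
from typing import List, Sequence


def decimate(signal: Sequence[float], m: int) -> List[float]:
    """Keep every m-th sample (the subband is assumed to be band-limited)."""
    return list(signal[::m])


def upsample(signal: Sequence[float], m: int) -> List[float]:
    """Insert m - 1 zeros after each sample; a synthesis filter is
    expected to smooth the result afterwards."""
    out: List[float] = []
    for sample in signal:
        out.append(sample)
        out.extend([0.0] * (m - 1))
    return out
```

With a decimation factor of 4, a subband signal of 8 samples is reduced to 2 samples before unmixing and restored to 8 samples afterwards, which is why the unmixing filters can be correspondingly shorter.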

14.3 Experiments and Results A set of nine different signals, consisting of speakers (three females and six males) reading sentences for approximately 30 s, was used throughout the experiments. These speech signals were collected by Nion et al. [17]. The signals were downsampled to 8 kHz. The C-FICA algorithm (a convolutive extension of Fast-ICA, fast independent component analysis) and the full-band Infomax algorithm are used as baseline systems for evaluation. The C-FICA algorithm proposed by Thomas et al. [22] consists of time-domain extensions of the Fast-ICA algorithms developed by Hyvärinen et al. [11] for instantaneous mixtures. To evaluate the source contributions, C-FICA uses a least-squares criterion, whose optimization is carried out by a Wiener filtering process. The convolutive version of full-band Infomax introduced in Sect. 14.2.3 is also used in the evaluation tests.

14.3.1 Evaluation Criteria To evaluate the performance of BSS methods, two objective measures were used, namely the signal-to-interference ratio (SIR) and the perceptual evaluation of speech quality (PESQ). The SIR has been shown to be an efficient criterion for several methods aiming at reducing the effects of interference [9]. The SIR is an important quantity in communications engineering that indicates the quality of a speech signal between a transmitter and a receiver environment. It is selected as the criterion for optimization. This measure is defined by

$$\mathrm{SIR} = 10 \log_{10} \frac{\| s_{target} \|^2}{\| e_{interf} \|^2}, \qquad (14.10)$$

where $s_{target}(n)$ is an allowed deformation of the target source $s_i(n)$ and $e_{interf}(n)$ is an allowed deformation of the sources which accounts for the interference of the unwanted sources. These signals are derived from a decomposition of a given estimate $y_i(n)$ of a source $s_i(n)$. The second measure used to evaluate the quality of source separation is the PESQ. The latter is standardized in ITU-T Recommendation P.862 [12] and is generally used to evaluate speech enhancement systems [16]. The algorithm predicts subjective opinion scores for degraded speech samples; theoretically, the results can be mapped to mean opinion scores (MOS) based on the degradation of the speech sample. PESQ returns a score from −0.5 to 4.5, with higher scores suggesting better quality. The code provided by Loizou in [16] is used in our experiments. In general, the reference signal is an original (clean) signal and the degraded signal is the same utterance pronounced by the same speaker as in the clean signal but subjected to diverse adverse conditions. In the PESQ algorithm, the reference and degraded signals are level-equalized to a standard listening level in the pre-processing stage. The gains of the two signals may vary considerably and are a priori unknown. In the original PESQ algorithm, the gains of the reference, degraded and corrected signals are computed based on the root mean square values of band-pass-filtered (350–3,250 Hz) speech.
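Once the two components of Eq. 14.10 are available, the SIR itself is a simple energy ratio, as the Python sketch below shows. The decomposition of an estimated source into its target and interference parts (performed in the chapter with the toolbox of [9]) is not reproduced here; the function names are illustrative.

```python
import math
from typing import Sequence


def energy(signal: Sequence[float]) -> float:
    """Squared Euclidean norm of a signal."""
    return sum(v * v for v in signal)


def sir_db(s_target: Sequence[float], e_interf: Sequence[float]) -> float:
    """SIR of Eq. 14.10 in decibels, given the two decomposed components."""
    return 10.0 * math.log10(energy(s_target) / energy(e_interf))
```

For example, a target component with 100 times the energy of the interference component gives an SIR of 20 dB.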

14.3.2 Discussion Different configurations of the subband analysis and synthesis stages, as well as of the decimation factor, have been tested. The number of subbands was fixed at 24. In our experiments we observed that when all the subbands were kept, the results were not satisfactory. In fact, we noticed that some high-frequency subbands are not used, and this causes distortions in the listened signals. However, as shown in Fig. 14.2, the best performance was achieved for $N'_c = 24$ and $M = 4$. In addition to the use of causal FIR filters, we adapted the unmixing stage weights to non-causal FIR filters by centering the $L$ taps. From Fig. 14.2, we observe that the causal FIR filters yield a better SIR improvement than the non-causal ones. Another set of experiments was carried out to evaluate the performance in the presence of additive noise at the sensors. We used the signal-to-noise ratio (SNR), which is defined in [9] by

$$\mathrm{SNR} = 10 \log_{10} \frac{\| s_{target} + e_{interf} \|^2}{\| e_{noise} \|^2}, \qquad (14.11)$$


Fig. 14.2 SIR improvement for both causal and non-causal filters. $N'_c$ denotes the number of filters that have been used among the $N_c$ filters, and $M$ is the decimation factor

Fig. 14.3 SNR comparison between the subband and fullband methods

where $e_{noise}$ is an allowed deformation of the perturbing noise, and $s_{target}$ and $e_{interf}$ were defined previously. Figure 14.3 shows the SNR improvement obtained using our subband decomposition, compared to the fullband method, i.e. the Infomax algorithm in the convolutive case.


We have also compared the proposed method with the well-known C-FICA and fullband Infomax techniques using the PESQ and SIR objective measures. Among the available data, we considered a two-input, two-output convolutive BSS problem. We convolutively mixed two speech signals, one pronounced by a man and one by a woman. We repeated this procedure with different pairs of sentences, and the averages of the evaluation measures (SIR and PESQ) were calculated. As illustrated in Fig. 14.4, we selected impulse responses associated with source positions defined by different angles in relation to the dummy head. As can be seen in Table 14.1, the proposed subband method is efficient and has the additional advantage that the preprocessing step is not necessary. The method was also verified subjectively by listening to the original, mixed and separated signals. The PESQ scores confirm the superiority of the proposed method in terms of intelligibility and quality of separation when compared to the baseline techniques. The best HRTF configuration was obtained for the 20–50° angle of the dummy head, where a PESQ of 4.24 and an SIR of 13.83 dB were achieved.

Fig. 14.4 The convolutive model with source positions at 30 and 40° angles in relation to the dummy head

Table 14.1 SIR and PESQ of proposed subband BSS, C-FICA method and full-band BSS

Angle (°)   C-FICA               Full-band BSS        Proposed
            PESQ    SIR (dB)     PESQ    SIR (dB)     PESQ    SIR (dB)
10          2.61    5.08         3.15    8.45         3.85    13.45
10          2.68    5.37         3.27    9.74         4.10    13.62
10          2.04    4.45         2.23    7.21         2.92    8.96
60          2.52    6.67         2.76    8.94         3.24    10.05
20          3.18    7.54         3.81    11.28        4.24    13.83
50          2.95    6.82         3.55    10.79        4.16    11.67
20          3.02    6.62         3.32    10.02        3.42    12.14
120         2.15    5.87         3.04    9.11         3.29    10.52
30          2.28    6.29         2.47    7.55         2.88    9.76
80          1.94    4.72         2.13    7.02         2.51    8.73

14.4 Conclusion An ear-based subband BSS approach was proposed for the separation of convolutive mixtures of speech. The results showed that using a subband decomposition that mimics human perception, together with the Infomax algorithm, yields better results than the fullband and C-FICA methods. Experimental results also showed the high efficiency of the new method in improving the SNR of unmixed signals in the case of noisy sensors. It is worth noting that an important advantage of the proposed technique is that it uses a simple time-domain sample-by-sample algorithm to perform the decomposition and that it does not need a pre-processing step.

References
1. Araki S, Makino S, Nishikawa T, Saruwatari H (2001) Fundamental limitation of frequency domain blind source separation for convolutive mixture of speech. In: IEEE-ICASSP conference, pp 2737–2740
2. Araki S, Makino S, Aichner R, Nishikawa T, Saruwatari H (2005) Subband-based blind separation for convolutive mixtures of speech. IEICE Trans Fundamentals E88-A(12):3593–3603
3. Basak J, Amari S (1999) Blind separation of uniformly distributed signals: a general approach. IEEE Trans Neural Networks 10:1173–1185
4. Bell AJ, Sejnowski TJ (1995) An information maximization approach to blind separation and blind deconvolution. Neural Comput 7(6):1129–1159
5. Belouchrani A, Abed-Meraim K, Cardoso JF, Moulines E (1997) A blind source separation technique using second-order statistics. IEEE Trans Signal Process 45(2):434–444
6. Ben Salem A, Selouani SA, Hamam H (2010) Auditory-based subband blind source separation using sample-by-sample and Infomax algorithms. In: Lecture notes in engineering and computer science: proceedings of the World Congress on engineering, 2010, WCE 2010, 30 June–2 July, London, UK, pp 651–655
7. Caelen J (1985) Space/time data-information in the A.R.I.A.L. project ear model. Speech Commun J 4:163–179
8. Cardoso JF (1989) Source separation using higher order moments. In: Proceedings IEEE ICASSP, Glasgow, UK, vol 4, pp 2109–2112
9. Fevotte C, Gribonval R, Vincent E (2005) BSS_EVAL toolbox user guide. IRISA, Rennes, France, Technical Report 1706 [Online]. Available: http://www.irisa.fr/metiss/bss_eval
10. Gardner B, Martin K Head related transfer functions of a dummy head [Online]. Available: http://www.sound.media.mit.edu/ica-bench/
11. Hyvärinen A, Karhunen J, Oja E (2001) Independent component analysis. Wiley, New York
12. ITU (2000) Perceptual evaluation of speech quality (PESQ), an objective method for end-to-end speech quality assessment of narrowband telephone networks and speech codecs. ITU-T Recommendation P.862
13. Karhunen J, Joutsensalo J (1994) Representation and separation of signals using nonlinear PCA type learning. Neural Networks 7:113–127
14. Kawaguchi A (2010) Statistical inference for independent component analysis based on polynomial spline model. In: IAENG conference, vol I, IMECS 2010, Hong Kong
15. Kokkinakis K, Loizou PC (2007) Subband-based blind signal processing for source separation in convolutive mixtures of speech. In: IEEE-ICASSP conference, pp 917–920
16. Loizou P (2007) Speech enhancement: theory and practice. CRC Press LLC, Boca Raton
17. Nion D, Mokios KN, Sidiropoulos ND, Potamianos A (2010) Batch and adaptive PARAFAC-based blind separation of convolutive speech mixtures. IEEE Trans Audio Speech Language Process 18(6):1193–1207
18. Park HM, Dhir CS, Oh SH, Lee SY (2006) A filter bank approach to independent component analysis for convolved mixtures. Neurocomputing 69:2065–2077
19. Pedersen MS, Larsen J, Kjems U, Parra LC (2007) A survey of convolutive blind source separation methods. Springer handbook on speech processing and speech communication. Springer, Berlin
20. Russel I, Xi J, Mertins A, Chicharo J (2004) Blind source separation of non-stationary convolutively mixed signals in the subband domain. In: IEEE-ICASSP conference, pp 481–484
21. Tolba H, Selouani SA, O'Shaughnessy D (2002) Auditory-based acoustic distinctive features and spectral cues for automatic speech recognition using a multistream paradigm. In: IEEE-ICASSP conference 2002, pp 837–840
22. Thomas J, Deville Y, Hosseini S (2006) Time-domain fast fixed-point algorithms for convolutive ICA. IEEE Signal Process Lett 13(4):228–231
23. Wax M, Kailath T (1985) Detection of signals by information theoretic criteria. IEEE Trans Acoust Speech Signal Process 33(2):387–392

Chapter 15

Time Domain Features of Heart Sounds for Determining Mechanical Valve Thrombosis Sabri Altunkaya, Sadık Kara, Niyazi Görmüş and Saadetdin Herdem

Abstract Thrombosis of an implanted heart valve is a rare but lethal complication for patients with a mechanical heart valve. An echocardiogram of the mechanical heart valve is necessary to diagnose valve thrombosis definitively. Because of the difficulty of making an early diagnosis of thrombosis, and the cost of diagnostic equipment and operators, developing noninvasive, cheap and simple methods to evaluate the functionality of mechanical heart valves is quite important, especially for first-step medical centers. For this reason, time domain features obtained from auscultation of heart sounds are proposed in this chapter as a simple method to evaluate mechanical heart valve thrombosis. To this end, heart sounds of one patient with mechanical heart valve thrombosis and five patients with normally functioning mechanical heart valves were recorded. Time domain features of the recorded heart sounds, the skewness and kurtosis, were calculated and statistically evaluated using paired and unpaired t-tests. As a result, it is clearly seen that the skewness of the first heart sound is the most discriminative feature (p < 0.01) and it

S. Altunkaya (&) S. Herdem Department of Electrical and Electronics Engineering, Selçuk University, 42075 Konya, Turkey e-mail: [email protected] S. Herdem e-mail: [email protected] S. Kara Biomedical Engineering Institute, Fatih University, 34500 Istanbul, Turkey e-mail: [email protected] N. Görmüş Department of Cardiovascular Surgery, Meram Medical School of Selcuk University, 42080 Konya, Turkey e-mail: [email protected]

S. I. Ao and L. Gelman (eds.), Electrical Engineering and Applied Computing, Lecture Notes in Electrical Engineering, 90, DOI: 10.1007/978-94-007-1192-1_15, Ó Springer Science+Business Media B.V. 2011


may be used fairly well in differentiating normally functioning mechanical heart valve from malfunctioning mechanical heart valve.

15.1 Introduction Mechanical heart valve thrombosis is any thrombus attached to a mechanical valve, occluding part of the blood flow or interfering with valvular function [1]. Mechanical heart valve thrombosis is a critical complication associated with high mortality and requires immediate diagnosis and thrombolytic or surgical treatment [2]. Progress in the structure and design of mechanical heart valves over the years has led to a considerable improvement in their hemodynamic features and durability. However, thromboembolic complications remain a troublesome cause of postoperative morbidity and mortality [3, 4]. According to the literature, the reported incidence of thromboembolic complications ranges from 0.03 to 4.3% per patient-year [2], 0.5–6% per patient-year [3], 2–4% of patients per annum [5], or 0.5% per patient-year [6], depending on the generation and the thrombogenicity of the prosthesis used, the location of the valve, and the quality of the anticoagulation [2]. Recently, transesophageal echocardiography has become the gold standard both in the early diagnosis of prosthetic valve thrombosis and in risk stratification for obstruction or embolism in patients with prosthetic heart valves [7]. However, using echocardiography to diagnose mechanical heart valve thrombosis is quite expensive for first-step medical centers, because it requires both a specialist operator and costly devices. This may lead to thrombotic complications being misdiagnosed in first-step medical centers. Therefore, developing noninvasive, cheap and simple methods to evaluate the functionality of mechanical heart valves is quite important [8–11]. Only a limited number of studies evaluate mechanical heart valve thrombosis using heart sounds. Although studies of the frequency spectrum of mechanical heart valve sounds are also limited in number, it is known that thrombus formation on a prosthetic heart valve changes the frequency spectrum of both biological and mechanical heart valves. In past studies, features obtained from frequency and time–frequency analysis of heart sounds were used to detect mechanical heart valve thrombosis; generally, the modified forward–backward Prony's method was used to detect the frequency components of prosthetic heart valve sounds [8–10, 12, 13]. In this chapter, time domain features instead of frequency domain features are proposed to evaluate thrombosis on the mechanical heart valve. To this end, heart sounds of one patient with thrombosis and heart sounds of five patients with normally functioning mechanical heart valves were recorded. The skewness and kurtosis of the heart sounds were computed as time domain features and statistically evaluated using the t-test.
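For readers unfamiliar with the two time domain features, the Python sketch below computes them with the usual moment-based definitions (third and fourth standardized central moments). The chapter does not state which estimator was used, so the biased moment form, the non-excess kurtosis convention (a Gaussian signal gives 3) and the function name are assumptions.

```python
from typing import Sequence, Tuple


def skewness_kurtosis(x: Sequence[float]) -> Tuple[float, float]:
    """Return (skewness, kurtosis) from central moments of the samples."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n   # variance (biased)
    m3 = sum((v - mean) ** 3 for v in x) / n   # third central moment
    m4 = sum((v - mean) ** 4 for v in x) / n   # fourth central moment
    return m3 / m2 ** 1.5, m4 / m2 ** 2
```

Applied to a segmented S1 or S2 sound, the function yields one skewness and one kurtosis value per heart beat, which can then be compared between groups.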

Table 15.1 Clinical information of patients

Pat. no   Sex   Age   Valve size (mm)   Valve type   Condition
1         F     58    25                Sorin        Normal
2         F     27    29                Sorin        Normal
3         F     35    29                Sorin        Normal
4         F     30    29                St. Jude     Normal
5         F     45    29                St. Jude     Normal
6         F     55    29                Sorin        With thrombosis

15.2 Patients and Data Acquisition This study includes patients who were operated on in the Department of Cardiovascular Surgery, Meram Medical School of Selcuk University. Five patients with normally functioning mechanical heart valves and one patient with mechanical valve thrombosis were selected to evaluate mechanical heart valve thrombosis using heart sounds. The heart sounds of the patient with thrombosis were recorded before and after thrombolytic treatment. The functionality of the patients' mechanical heart valves was investigated by the physician using echocardiography. In this investigation, a thrombus with partial obstruction was observed on the mitral mechanical heart valve of one patient. The heart sounds of the five patients with normally functioning mechanical heart valves were recorded within one year after heart valve replacement. The heart sounds of the patients were recorded from the mitral area (the intersection of the left fifth intercostal space and the midclavicular line) over a period of 30 s [14]. All patients had mitral valve replacement; their clinical information is shown in Table 15.1. ECG signals were recorded simultaneously with the heart sounds in order to segment the first and second heart sounds. An E-Scope II electronic stethoscope manufactured by Cardionics was used to record the heart sounds. Sound signals obtained from the electronic stethoscope and ECG signals obtained from the surface electrodes were digitized at a 5000 Hz sampling frequency using the Biopac MP35 data acquisition device.

15.3 Extraction of First Heart Sounds (S1) and Second Heart Sounds (S2) As mentioned in the previous section, 30 s of heart sounds were recorded from each patient. The heart sound signal contains one S1 and one S2 component per heart beat. This section discusses the detection of S1 and S2, whose number varies with the number of beats in the recording. It is known that S1 occurs after the onset of the QRS complex and that S2 occurs towards the end of the T wave of the ECG. Using these two relations between heart sounds and ECG, the S1 and S2 sounds were obtained from the 30 s records. The processing of the recorded heart sound signals can be summarized as follows. First, the recorded heart sounds and ECG signals are filtered and normalized. After that, the QRS and T peaks of the ECG signal are detected. Finally, the S1 and S2 sounds are detected using the QRS and T peaks [14].

15.3.1 Preprocessing of ECG and Heart Sounds Signals

All recorded heart sounds were filtered with a 30 Hz high-pass and 2000 Hz low-pass digital finite impulse response filter to remove noise, and were normalized using

\[ HS_{norm}(n) = \frac{HS(n)}{\max |HS(n)|} \tag{15.1} \]

where HS(n) is the raw heart sound signal and HSnorm(n) is the normalized heart sound signal. A normalization step was also applied to the ECG signal.
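As a minimal sketch of the normalization in Eq. 15.1, the Python below divides every sample by the largest absolute sample value. The function name `normalize`, the sample values and the handling of an all-zero recording are illustrative assumptions, not part of the original method.

```python
def normalize(signal):
    """Scale a signal so that its largest absolute sample equals 1 (Eq. 15.1)."""
    peak = max(abs(x) for x in signal)
    if peak == 0:
        # Assumption: an all-zero recording is returned as all zeros.
        return [0.0 for _ in signal]
    return [x / peak for x in signal]


raw = [0.2, -0.5, 0.1, 0.25]   # hypothetical raw heart sound samples
hs_norm = normalize(raw)       # the largest magnitude becomes exactly 1
```

After this step every sample lies in the range [-1, 1], which is what the Shannon energy calculation later in the chapter relies on.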

15.3.2 Detection of QRS Complex and T Peak

The QRS complexes of the ECG signal are detected using a first-derivative-based QRS detection algorithm. In this algorithm, the ECG signal is first band-pass filtered with a pass band of 10–20 Hz to eliminate baseline wander and high-frequency noise. After filtering, the ECG signal is differentiated to obtain QRS slope information, squared point by point to emphasize the QRS complex, and then time-averaged by taking the mean of the previous 10 points. The time-averaged ECG signal is compared to a threshold to locate the QRS complex [15, 16]. The threshold is chosen to be a quarter of the maximum of the time-averaged ECG signal. The maximum of each portion of the time-averaged ECG signal that exceeds the threshold is accepted as an R peak. After that, every interval between consecutive R peaks (RR interval) is compared to 0.5 and 1.5 times the mean RR interval. If an RR interval is longer than 1.5 times or shorter than 0.5 times the mean RR interval, that RR interval and the corresponding segment of the heart sound signal are removed to prevent incorrect RR interval detection. T waves are detected using the physiological knowledge that the peak of the T wave occurs at least 60 ms after the R peak and normally lies within the first two-thirds of the RR interval. The maximum of the ECG signal in this interval is taken as the T peak, which is used to locate S2 [17].
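The detection chain described above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the authors' implementation: the input is taken to be already band-pass filtered, the "previous 10 points" are read as a trailing window that includes the current sample, and the names `detect_r_peaks` and `screen_rr_intervals` are hypothetical.

```python
def detect_r_peaks(ecg, window=10):
    """Return sample indices accepted as R peaks (derivative, square, average, threshold)."""
    # First difference approximates the derivative; a leading 0 keeps indices aligned.
    diff = [0.0] + [ecg[i] - ecg[i - 1] for i in range(1, len(ecg))]
    squared = [d * d for d in diff]
    # Trailing moving average over `window` samples (assumed to include the current one).
    averaged = []
    for i in range(len(squared)):
        chunk = squared[max(0, i - window + 1):i + 1]
        averaged.append(sum(chunk) / len(chunk))
    threshold = 0.25 * max(averaged)   # a quarter of the maximum
    peaks = []
    i = 0
    while i < len(averaged):
        if averaged[i] > threshold:
            j = i
            while j < len(averaged) and averaged[j] > threshold:
                j += 1
            # The maximum of each above-threshold region is taken as the R peak.
            peaks.append(max(range(i, j), key=lambda k: averaged[k]))
            i = j
        else:
            i += 1
    return peaks


def screen_rr_intervals(peaks):
    """Keep (start, end) RR intervals within 0.5 to 1.5 times the mean RR interval."""
    pairs = list(zip(peaks, peaks[1:]))
    mean_rr = sum(b - a for a, b in pairs) / len(pairs)
    return [(a, b) for a, b in pairs if 0.5 * mean_rr <= b - a <= 1.5 * mean_rr]
```

Because of the trailing average, the index returned for each beat lags the true R peak by a few samples; a practical implementation would typically compensate for this delay.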

15.3.3 Detection of S1 and S2

The ECG signal and the Shannon energy are used to detect the heart sounds. The ECG signal is used as a time reference to determine the time interval in which S1 and S2 are searched for within one heart cycle. The Shannon energy of the heart sounds is used to determine the exact location of S1 and S2 within that interval.

Time Domain Features of Heart Sounds

Fig. 15.1 ECG, heart sounds and Shannon energy of one patient with MVR

The Shannon energy of the normalized heart valve sound (HSnorm) can be calculated using

\[ SE = -\frac{1}{N} \sum_{n=1}^{N} HS_{norm}^{2}(n) \, \log HS_{norm}^{2}(n) \tag{15.2} \]

where SE is the Shannon energy of HSnorm, N is the length of the recorded data and n is the sample index of HSnorm [18]. After the Shannon energy of the heart sounds is calculated, the exact location of S1 within the RR interval is determined by taking the maximum of the Shannon energy in the interval between 0.01 RR and 0.2 RR as the center of S1. The maximum of the Shannon energy between the ECG T peak and the T peak time plus 150 ms is accepted as the center of S2. The durations of S1 and S2 were chosen to be 150 and 75 ms, respectively, on both sides of the center (Fig. 15.1). If the Shannon energy on the right or left side of the center is larger than 40% of the maximum Shannon energy, the duration of the chosen heart sound is increased by 20%. The comparison is repeated until the Shannon energy on the right and left sides of the center is smaller than 40% of the maximum Shannon energy [17, 19]. In Fig. 15.1, the upper graph shows the RR interval of the ECG signal, the 0.3–0.65 RR interval, and the interval from the ECG T peak to the T peak time plus 150 ms; the middle graph shows the heart sound signal; and the bottom graph shows the calculated Shannon energy.
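To locate a maximum in time, the Shannon energy of Eq. 15.2 has to be evaluated over short frames rather than over the whole record. The sketch below illustrates one way of doing so; the frame length, hop size and the function names `shannon_energy`, `shannon_envelope` and `s1_center` are assumptions made for illustration and are not given in the chapter.

```python
import math


def shannon_energy(frame):
    """Average Shannon energy of a frame of normalized samples (Eq. 15.2)."""
    total = 0.0
    for x in frame:
        power = x * x
        if power > 0.0:               # x^2 * log(x^2) tends to 0 as x tends to 0
            total += power * math.log(power)
    return -total / len(frame)


def shannon_envelope(hs_norm, frame_len=100, hop=50):
    """Return (center_index, energy) pairs for overlapping frames (assumed sizes)."""
    envelope = []
    for start in range(0, len(hs_norm) - frame_len + 1, hop):
        frame = hs_norm[start:start + frame_len]
        envelope.append((start + frame_len // 2, shannon_energy(frame)))
    return envelope


def s1_center(envelope, r_index, rr_len, lo=0.01, hi=0.2):
    """Frame center with maximum energy between lo*RR and hi*RR after the R peak."""
    window = [(energy, center) for center, energy in envelope
              if r_index + lo * rr_len <= center <= r_index + hi * rr_len]
    return max(window)[1] if window else None
```

At the 5000 Hz sampling rate stated earlier, a 100-sample frame spans 20 ms. The same search, applied between the T peak and the T peak plus 150 ms, would give the center of S2.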

Table 15.2 Mean ± standard deviation (std.) of the skewness and kurtosis of heart sounds

                  Thr (mean ± std.)   AThr (mean ± std.)   N (mean ± std.)
Skewness of S1    0.96 ± 0.36         0.18 ± 0.25          0.12 ± 0.42
Skewness of S2    0.71 ± 0.7          0.3 ± 0.36           -0.2 ± 0.54
Kurtosis of S1    5.24 ± 1.11         4.34 ± 0.45          5.9 ± 1.75
Kurtosis of S2    8.39 ± 2.47         5.65 ± 1.56          5.79 ± 1.92

15.4 Skewness and Kurtosis

Changes in the signal, or in the distribution of the signal segments, are measured in terms of the skewness and kurtosis. The skewness characterizes the degree of asymmetry of a distribution around its mean. For a real signal x(n), the skewness is defined as

\[ \text{Skewness} = \frac{E\left[(x - \mu)^{3}\right]}{\sigma^{3}} \tag{15.3} \]

where μ is the mean, σ is the standard deviation and E denotes statistical expectation. A nonzero skewness indicates that the data are asymmetrically distributed around the mean. If the bulk of the distribution lies to the right of the mean (a longer left tail), the skewness is negative. If the bulk of the distribution lies to the left of the mean (a longer right tail), the skewness is positive. The skewness is zero for a symmetric distribution. The kurtosis measures the relative peakedness or flatness of a distribution. The kurtosis of a real signal x(n) is calculated using

\[ \text{Kurtosis} = \frac{E\left[(x - \mu)^{4}\right]}{\sigma^{4}} \tag{15.4} \]

where μ is the mean, σ is the standard deviation and E denotes statistical expectation. For symmetric unimodal distributions, a kurtosis higher than 3 indicates heavy tails and peakedness relative to the normal distribution, whereas a kurtosis lower than 3 indicates light tails and flatness [20, 21].
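Equations 15.3 and 15.4 can be evaluated on a sampled segment by replacing the expectation with a sample average. The sketch below does this with the population (divide by N) standard deviation, which is an assumption, since the chapter does not say which estimator was used; the function name is also hypothetical.

```python
def skewness_kurtosis(x):
    """Sample skewness (Eq. 15.3) and kurtosis (Eq. 15.4) of a sequence of numbers."""
    n = len(x)
    mu = sum(x) / n
    # Central moments, estimated with a plain average over the segment.
    m2 = sum((v - mu) ** 2 for v in x) / n
    m3 = sum((v - mu) ** 3 for v in x) / n
    m4 = sum((v - mu) ** 4 for v in x) / n
    sigma = m2 ** 0.5
    return m3 / sigma ** 3, m4 / sigma ** 4


# A segment with a long right tail has positive skewness.
skew, kurt = skewness_kurtosis([0.0, 0.0, 0.0, 4.0])
```

Applied to each extracted S1 or S2 segment, this yields one skewness and one kurtosis value per heartbeat, which is the form of the data summarized in the results.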

15.5 Result and Discussion

There are approximately 30 first heart sound (S1) and 30 second heart sound (S2) components in the 30 s heart sound recording of each patient. The skewness and kurtosis of all of these S1 and S2 components were calculated for all recorded heart sounds. Table 15.2 shows the mean and standard deviation (std.) of the skewness and kurtosis of the heart sounds of the patient with mechanical heart valve thrombosis (Thr), of the same patient after thrombolytic treatment (AThr), and of the five patients with normally functioning mechanical heart valves (N). Figure 15.2 illustrates the summary statistics for the skewness and kurtosis of S1 and S2 for Thr, AThr and N.

Fig. 15.2 Box plots of the skewness of S1, the skewness of S2, the kurtosis of S2 and the kurtosis of S1 (Thr: patient with mechanical valve thrombosis, AThr: patient after thrombolytic treatment and N: five patients with normally functioning mechanical valve)

From Table 15.2 and Fig. 15.2, it can be said that there are meaningful differences between the means of the skewness of S1, the skewness of S2 and the kurtosis of S2 of the heart sounds of patients with normally functioning and malfunctioning mechanical heart valves. The kurtosis of S1 has a similar mean for these heart sounds. However, it is clearly seen that the skewness of S1 is the best feature for showing the difference between normally functioning and malfunctioning mechanical heart valves. A paired t-test with a 99% confidence level was used to compare the means of the skewness and kurtosis between the heart sounds of the patient before and after thrombolytic treatment. An unpaired t-test with a 99% confidence level was used to compare the means of the skewness and kurtosis between the patients with normally functioning mechanical heart valves and the patient with mechanical heart valve thrombosis (before treatment). These tests were applied to two features, the skewness and kurtosis, obtained from the two sound components S1 and S2. The p values obtained from these tests are shown in Table 15.3. The skewness of S2 shows a statistically significant difference only between Thr and N, the kurtosis of S1 only between Thr and AThr, and the kurtosis of S2 only between Thr and N (p < 0.01). The skewness of S1, however, shows statistically significant differences both between Thr and AThr and between Thr and N (p < 0.01). Because of this, the skewness of S1 is the best feature for distinguishing the heart sounds of a patient with mechanical valve thrombosis from those of patients with a normally functioning mechanical heart valve.

Table 15.3 p values obtained from the paired and unpaired t-tests

                  Between heart sounds of Thr      Between heart sounds of Thr
                  and AThr (paired t-test)         and N (unpaired t-test)
Skewness of S1    0.000006                         0.0000018
Skewness of S2    0.0011                           0.0199
Kurtosis of S1    0.0712                           0.0046
Kurtosis of S2    0.0924                           0.0016

Thr: Patient with mechanical heart valve thrombosis
AThr: Patient with mechanical heart valve thrombosis after thrombolytic treatment
N: Five patients with normally functioning mechanical heart valve

15.6 Conclusion and Future Work

In this chapter, the skewness and kurtosis of the heart sounds of patients with mechanical heart valve thrombosis and with normally functioning mechanical heart valves were compared statistically. The results suggest that the skewness of S1 should perform fairly well in differentiating normally functioning from malfunctioning mechanical heart valves. However, the effectiveness of the skewness of S1 in detecting a malfunctioning mechanical heart valve should be confirmed in a larger patient population. After that, the skewness of S1 may be used in the analysis of mechanical heart valve sounds with a view to detecting thrombosis formation on mechanical heart valves.

Acknowledgments This work was supported by the scientific research projects (BAP) coordinating office of Selçuk University.

References

1. Edmunds LH, Clark RE, Cohn LH, Grunkemeier GL, Miller DC, Weisel RD (1996) Guidelines for reporting morbidity and mortality after cardiac valvular operations. J Thorac Cardiovasc Surg 112:708–711
2. Roudaut R, Lafitte S, Roudaut MF, Courtault C, Perron JM, Jaïs C et al (2003) Fibrinolysis of mechanical prosthetic valve thrombosis. J Am Coll Cardiol 41(4):653–658
3. Caceres-Loriga FM, Perez-Lopez H, Santos-Gracia J, Morlans-Hernandez K (2006) Prosthetic heart valve thrombosis: pathogenesis, diagnosis and management. Int J Cardiol 110:1–6
4. Roscitano A, Capuano F, Tonelli E, Sinatra R (2005) Acute dysfunction from thrombosis of mechanical mitral valve prosthesis. Braz J Cardiovasc Surg 20(1):88–90
5. Schlitt A, Hauroeder B, Buerke M, Peetz D, Victor A, Hundt F, Bickel C et al (2002) Effects of combined therapy of clopidogrel and aspirin in preventing thrombosis formation on mechanical heart valves in an ex vivo rabbit model. Thromb Res 107:39–43
6. Koller PT, Arom KV (1995) Thrombolytic therapy of left-sided prosthetic valve thrombosis. Chest 108:1683–1689
7. Kaymaz C, Özdemir N, Çevik C, Izgi C, Özveren O, Kaynak E et al (2003) Effect of paravalvular mitral regurgitation on left atrial thrombosis formation in patients with mechanical mitral valves. Am J Cardiol 92:102–105
8. Kim SH, Lee HJ, Huh JM, Chang BC (1998) Spectral analysis of heart valve sound for detection of prosthetic heart valve diseases. Yonsei Med J 39(4):302–308
9. Kim SH, Chang BC, Tack G, Huh JM, Kang MS, Cho BK, Park YH (1994) In vitro sound spectral analysis of prosthetic heart valves by mock circulatory system. Yonsei Med J 35(3):271–278
10. Candy JV, Meyer AW (2001) Processing of prosthetic heart valve sounds from anechoic tank measurements. 8th International Congress on Sound and Vibration, China
11. Grigioni M, Daniele C, Gaudio CD, Morbiducci U, D’avenio G, Meo DD, Barbaro V (2007) Beat to beat analysis of mechanical heart valves by means of return map. J Med Eng Technol 31(2):94–100
12. Sava HP, Bedi R, McDonnell TE (1995) Spectral analysis of carpentier-edwards prosthetic heart valve sounds in the aortic position using svd-based methods. Signal Process Cardiogr IEE Colloq 6:1–4
13. Sava HP, McDonnell JTE (1996) Spectral composition of heart sounds before and after mechanical heart valve implantation using a modified forward-backward Prony’s method. IEEE Trans Biomed Eng 43(7):734–742
14. Altunkaya S, Kara S, Görmüş N, Herdem S (2010) Statistically evaluation of mechanical heart valve thrombosis using heart sounds. Lecture notes in engineering and computer science: Proceedings of the World Congress on Engineering 2010, WCE 2010, 30 June–2 July, 2010, London, UK, 704–708
15. Pan J, Tompkins WJ (1985) A real-time QRS detection algorithm. IEEE Trans Biomed Eng 32(3):230–236
16. Köhler BU, Hennig C, Orglmeister R (2002) The principles of software QRS detection. IEEE Eng Med Biol Mag 21(2):42–57
17. Syed Z, Leeds D, Curtis D, Nesta F, Levine RA, Guttag J (2007) A framework for the analysis of acoustical cardiac signals. IEEE Trans Biomed Eng 54(4):651–662
18. Choi S, Jiang Z (2008) Comparison of envelope extraction algorithms for cardiac sound signal segmentation. Expert Syst Appl 34(2):1056–1069
19. El-Segaier M, Lilja O, Lukkarinen S, Sörnmo L, Sepponen R, Pesonen E (2005) Computer-based detection and analysis of heart sound and murmur. Ann Biomed Eng 33(7):937–942
20. Sanei S, Chambers JA (2007) EEG signal processing. Wiley, Chichester, pp 50–52
21. Press WH, Flannery BP, Teukolsky SA, Vetterling WT (1993) Numerical recipes in C: the art of scientific computing. Cambridge University Press, Cambridge, pp 610–615

Chapter 16

On the Implementation of Dependable Real-Time Systems with Non-Preemptive EDF

Michael Short

Abstract Non-preemptive schedulers remain a very popular choice for practitioners of resource constrained real-time embedded systems. This chapter is concerned with the non-preemptive version of the Earliest Deadline First algorithm (npEDF). Although several key results indicate that npEDF should be considered a viable choice for use in resource-constrained real-time systems, these systems have traditionally been implemented using static, table-driven approaches such as the ‘cyclic executive’. This is perhaps due to several popular misconceptions regarding the basic operation, optimality and robustness of this algorithm. This chapter will attempt to redress this balance by showing that many of the supposed ‘problems’ attributed to npEDF can be easily overcome by adopting appropriate implementation and analysis techniques. Examples are given to highlight the fact that npEDF generally outperforms other non-preemptive software architectures when scheduling periodic and sporadic tasks. The chapter concludes with the observation that npEDF should in fact be considered as the algorithm of choice for such systems.

M. Short (✉)
Electronics & Control Group, Teesside University, Middlesbrough, TS1 3BA, UK
e-mail: [email protected]

S. I. Ao and L. Gelman (eds.), Electrical Engineering and Applied Computing, Lecture Notes in Electrical Engineering, 90, DOI: 10.1007/978-94-007-1192-1_16, © Springer Science+Business Media B.V. 2011

16.1 Introduction

This chapter is concerned with the non-preemptive scheduling of recurring (periodic/sporadic) task models, with applications to resource-constrained, single-processor real-time and embedded systems. In particular, it is concerned with scheduler architectures consisting of a small amount of hardware (typically a timer/interrupt controller) and associated software. In this context, there are two main requirements of a scheduler. The first is task activation, which is the process of deciding at which points in time tasks become ready for execution. Periodic tasks are normally activated via a timer; event-driven (sporadic) tasks can be activated either directly by interrupts or by polling an interrupt status flag. The second is task dispatching, which is the process of deciding which of the active tasks is best to execute, and some form of scheduling algorithm is normally required to achieve this. These two main aspects of scheduling are illustrated in Fig. 16.1.

Fig. 16.1 Aspects of real-time embedded scheduling

The performance of scheduling algorithms and techniques is an area worthy of study; the seminal paper of Liu and Layland [1], published in 1973, spawned a multitude of research and a significant body of results can now be found in the literature. Liu and Layland were the first to discuss deadline-driven scheduling techniques in the context of real-time and embedded computing. It is known that when task preemption is allowed, this technique, also known as Earliest Deadline First (EDF), allows the full utilization of the CPU, and is optimal on a single processor under a wide variety of different operating constraints [1–3].
However, for developers of systems with severe resource constraints, preemptive scheduling techniques may not be viable; the study of non-preemptive alternatives is justified for the following (non-exhaustive) list of reasons [4–7]:
• Non-preemptive scheduling algorithms are easier to implement than their preemptive counterparts, and can exhibit dramatically lower runtime overheads;
• Non-preemptive scheduling naturally guarantees exclusive access to resources, eliminating the need for complex resource access protocols;
• Task sets under non-preemptive scheduling can share a common stack and processor context, leading to vastly reduced memory requirements;
• Cache and pipeline-related flushing following task preemptions does not occur in non-preemptive systems;
• Overload detection and recovery methods can be easier to implement;
• Initial studies seem to indicate that non-preemptive systems are less susceptible to transient errors than their preemptive counterparts.

Along with these advantages, non-preemptive scheduling is also known to have several associated disadvantages; task response times are generally longer, event-driven (sporadic) task executions are not as well supported, and when preemption is not allowed, many scheduling problems become either NP-Complete or NP-Hard [8]. This work is concerned with systems implementing the non-preemptive version of EDF (npEDF). The main motivating factors for the work are as follows. Although the treatment of npEDF in the literature has been (comparatively) small, several key results exist that indicate npEDF can overcome most (perhaps not all) of the problems associated with non-preemption; as such it should be considered a viable choice for use in resource-constrained real-time and embedded systems. However, such systems have traditionally been implemented using static, table-driven approaches such as the ‘cyclic executive’ and its variants (see, for example, [4, 9–11]). This is perhaps due to several popular misconceptions¹ with respect to the basic operation, implementation complexity, optimality and robustness of the npEDF algorithm, leading to a general lack of coverage in the wider academic community. This chapter will attempt to redress this balance by arguing the case for npEDF, and showing that the supposed ‘problems’ commonly attributed to it either simply do not hold, or can easily be overcome by adopting an appropriate implementation and by applying simple off-line analysis techniques.

The chapter is organized as follows. Section 16.2 considers why npEDF seems to be ‘missing’ from most major texts on real-time systems. Section 16.3 presents the assumed task model, and gives a basic description of npEDF. This section also identifies and expands a list of its common criticisms. Section 16.4 subsequently addresses each of these criticisms in turn, to establish their validity (or otherwise). Section 16.5 concludes the chapter.

16.2 npEDF: A Missing Algorithm?

In most of the major texts in the field of real-time systems, npEDF does not get more than a passing mention. For example, analysis of non-preemptive scheduling is typically restricted to the use of ‘cyclic executives’ or ‘timeline schedulers’. In almost all cases, after problems have been identified with such scheduling models, attention is then focused directly on Priority-Driven Preemptive (PDP) approaches as a ‘cure for all ills’. For example, Buttazzo [5] discusses timeline scheduling in Chapter 4 of his (generally) well-respected book on hard real-time computing systems, concluding with a list of problems associated with this type of scheduling. On p. 78, immediately before moving on to descriptions of PDP algorithms, it is stated that “The problems outlined above of timeline scheduling can be solved by using priority-based [preemptive] algorithms.”

Liu takes a similar approach in what is perhaps the most widely acclaimed book in this area (Real-Time Systems) [12]. Cyclic scheduling is discussed in Chapter 5 of her book, ending with a list of associated problems on p. 122. In each case, it is stated that a PDP system can overcome the problem. This type of argument is by no means limited to reference texts. Burns et al. [9] describe (in depth) some techniques that can be used for generating feasible cyclic or timeline schedules, followed by a discussion of the problems associated with this type of scheduling, directly followed by a final section (p. 160) discussing “Priority [-based preemptive] scheduling as an alternative to cyclic scheduling”. Whilst it would clearly be untrue to say that these statements are false, PDP scheduling is, as stated above, not without its own problems; the next section will examine the basics of npEDF, and examine why it seems to have been overlooked.

¹ The key results for npEDF, and their implications, are comparatively more difficult to interpret than those for other types of scheduling; for example, many previous works assume the reader possesses an in-depth understanding of formal topics in computer science, such as computational complexity.

16.3 Task Model and Preliminaries

16.3.1 Recurring Computational Tasks

This work is concerned with the implementation of recurring/repeated computations on a single processor, such as those that may be required in signal processing and control applications. Such a system may be represented by a set τ of n tasks, where each task t_i ∈ τ is represented by the tuple:

\[ t_i = (p_i, c_i, d_i) \tag{16.1} \]

Here p_i is the task period (minimum inter-arrival time), c_i is the (worst-case) computation requirement of the task and d_i is the task (relative) deadline. This model was introduced by Liu and Layland [1] and has since been widely adopted (see, for example, [2–7]). Note that it can be assumed without loss of generality that time is discrete and that all task parameters are integers [13]. Although implicit deadline tasks (i.e. those in which d_i = p_i) are most commonly discussed in the literature (and employed in practice), no specific relationship between periods and deadlines is assumed here, in order to fully generalize the work. Note that periodic tasks may additionally be described by a further parameter, an initial release time (or offset phasing) r_i. Finally, the system utilization U represents the fraction of time the processor will be occupied processing the jobs in the task set over its lifetime, and is defined as U = Σ c_i/p_i, where the sum is taken over all n tasks.
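As a small worked example of the utilization formula, the sketch below represents each task as a (p, c, d) tuple and sums c_i/p_i exactly using rational arithmetic. The task set is the one the chapter later uses for its example schedule; the function name `utilization` is a hypothetical choice.

```python
from fractions import Fraction


def utilization(tasks):
    """Total utilization U of a list of (period, computation, deadline) tuples."""
    return sum(Fraction(c, p) for p, c, d in tasks)


tasks = [(4, 1, 4), (6, 2, 6), (12, 3, 12)]
u = utilization(tasks)   # 1/4 + 2/6 + 3/12 = 5/6
```

Using exact fractions avoids rounding questions when U is compared against 1, which matters for the schedulability conditions discussed later in the chapter.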

16.3.2 npEDF Algorithm Operation

The npEDF algorithm may be described, in simple terms, as follows:

1. When selecting a task for execution, the task with the earliest deadline is selected first (and then run to completion).
2. Ties between tasks with identical deadlines are broken by selecting the task with the lowest index.
3. Unless the processor is idle, scheduling decisions are only made at task boundaries.
4. When the scheduler is idle, the first task to be invoked is immediately executed (if multiple tasks are simultaneously invoked, the task with the earliest deadline is selected).

This simple (but deceptively effective) algorithm may be implemented using only a single hardware timer for periodic tasks. The algorithm clearly differs from the static table-driven approaches in that the schedule is built on-line, and there is therefore no concept of a fixed time ‘frame’ or ‘tick’. An example schedule built by npEDF for the set of synchronous tasks τ = [(4, 1, 4), (6, 2, 6), (12, 3, 12)] is shown in Fig. 16.2.

Fig. 16.2 Example schedule generated by npEDF
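The four rules above can be illustrated with a short discrete-time simulation. The sketch below is only an illustration under simplifying assumptions: all tasks are periodic and released together at t = 0, every job runs for its full worst-case time, and the function name and data layout are hypothetical. It is not a real-time implementation.

```python
def np_edf_schedule(tasks, horizon):
    """Simulate npEDF for synchronous periodic tasks given as (p, c, d) tuples.

    Returns a list of (start, finish, task_index, absolute_deadline) entries.
    """
    next_release = [0] * len(tasks)   # all tasks released together at t = 0
    ready = []                        # pending jobs as (absolute_deadline, task_index)
    schedule = []
    t = 0
    while t < horizon:
        # Activate every job whose release time has been reached.
        for i, (p, c, d) in enumerate(tasks):
            while next_release[i] <= t:
                ready.append((next_release[i] + d, i))
                next_release[i] += p
        if not ready:
            # Processor idle: jump to the next release (rule 4).
            t = min(next_release)
            continue
        # Earliest deadline first; ties go to the lowest index (rules 1 and 2).
        ready.sort()
        deadline, i = ready.pop(0)
        finish = t + tasks[i][1]      # run to completion, no preemption (rule 3)
        schedule.append((t, finish, i, deadline))
        t = finish
    return schedule


tasks = [(4, 1, 4), (6, 2, 6), (12, 3, 12)]
schedule = np_edf_schedule(tasks, horizon=12)
missed = [job for job in schedule if job[1] > job[3]]   # jobs finishing late
```

For the example task set, the simulated jobs occupy 10 of the 12 time units in one hyperperiod, which agrees with a utilization of 5/6, and the `missed` list is empty.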

16.3.2.1 npEDF: Common Criticisms

As mentioned in the introduction, npEDF is generally seen as too problematic for use in real systems, largely because of misconceptions (or misinterpretations) of its basic operation and use. The main criticisms that can be found in the literature are listed below:

1. npEDF is not an optimal non-preemptive scheduling algorithm;
2. npEDF is difficult to analyze, and no efficient schedulability tests exist;
3. npEDF is not ‘robust’ to changes in the task set parameters;
4. Timer rollover can lead to anomalies and deadline misses in an otherwise schedulable task set;
5. The use of npEDF leads to increased overheads (and power consumption) compared to other non-preemptive scheduling techniques.

Note that optimal in this sense refers to the ability of npEDF to build a valid schedule, if such a schedule exists. Additionally, robustness refers to the ability of a scheduling algorithm to tolerate run-time reductions in the execution requirement of one (or more) tasks (or, equivalently, increases in period) without deadline misses occurring in an otherwise schedulable task set. Please also note that apart from point 3, this list of criticisms is specific to npEDF, and therefore does not include the so-called ‘long-task’ problem, which is endemic to all non-preemptive schedulers. This specific problem arises when one or more tasks have a deadline that is less than the execution time of another task. In this situation, effective solutions are known to include code refactoring at the task level, employing state machines, or alternatively adopting hybrid designs [4, 8, 14]. Such solution techniques easily generalize to npEDF, and are not discussed in any further depth here.

16.4 The Case for npEDF If all of the criticisms given in the previous section are based in fact, then npEDF does not seem a wise choice for system implementation; in fact the contrary would be true. This section will examine each point in greater detail, to investigate if, in fact, each specific claim actually holds.

16.4.1 npEDF is not Optimal

As mentioned previously, optimal in this sense refers to the ability of a scheduling algorithm to build a valid schedule for an arbitrary set of feasible tasks, if such a schedule exists. Each (and every) proof that npEDF is sub-optimal relies on a counter-example of the form shown in Fig. 16.3 (taken from Liu [12]; a similar example appears in Buttazzo [5]). It can be seen that despite the existence of a feasible schedule, obtained via the use of a scheduler which inserts idle-time between t = 3 and t = 4 (indicated by the question marks in the figure), the schedule produced by npEDF misses a deadline at t = 12. Since the use of inserted idle-time can clearly have a beneficial effect with respect to meeting deadlines, a related question immediately arises: how complex is a scheduler that uses inserted idle-time, and will such a scheduler be of practical use for a real system? The answer, unfortunately, is a resounding no. Two important results were formally shown by Howell and Venkatrao [15]. The first is that there cannot be an optimal on-line algorithm using inserted idle-time for the non-preemptive scheduling of sporadic tasks; only non-idling scheduling strategies can be optimal. The second is that an on-line scheduling strategy that makes use of inserted idle-time to schedule non-preemptive periodic tasks cannot be efficiently implemented unless P = NP. It can thus be seen that inserted idle-time is not beneficial when scheduling sporadic tasks, and if efficiency is taken into account, then attention must be restricted to non-idling strategies when scheduling periodic tasks. Efficiency in this sense refers to the amount of time taken by the scheduler to make scheduling decisions; only schedulers that take time proportional to some polynomial in the task set parameters can be considered efficient (a scheduler which takes 50 years to decide the optimal strategy for the next 10 ms is not of much practical use).

Fig. 16.3 npEDF misses a deadline, yet a feasible schedule exists

What is known about the non-idling scheduling strategies? These include, for example, npEDF, TTC scheduling [4, 14] and non-preemptive Rate Monotonic (npRM) scheduling [16]. npEDF is known to be optimal among this class of algorithms for scheduling recurring tasks; results in this area were known as early as 1955 [17]. The proof was demonstrated in the real-time context by Jeffay et al. [6] for the implicit deadline case, and extended by George et al. [18] to the constrained deadline case. Thus, the overall claim status: npEDF is sub-optimal for periodic tasks if and only if P = NP, and is optimal for sporadic tasks regardless of the equivalence (or otherwise) of these complexity classes.

16.4.2 No Efficient Schedulability Test Exists for npEDF

Consider again the example shown in Fig. 16.3, in which the npEDF algorithm misses a deadline. Why is the deadline missed? At t = 3, only J2 is active and, since the scheduler is non-idling, it immediately begins execution of this task. Subsequently at t = 4, J3 is released and has an earlier deadline, but it is blocked (due to non-preemption) until J2 has run to completion at t = 9. This is known as a ‘priority inversion’, as the scheduler cannot change its mind once committed. This is highlighted further in Fig. 16.4.

Fig. 16.4 Priority inversion due to non-preemption

Worst-case priority inversions under npEDF scheduling have been investigated in some detail. A relatively simple set of conditions for implicit deadline tasks was derived by Jeffay et al. [6], and was subsequently generalized by George et al. to the arbitrary deadline case [18]. They showed that a set of arbitrary-deadline periodic/sporadic tasks is schedulable under npEDF if and only if all deadlines are met over a specific analysis interval (of length L) following a synchronous arrival sequence of the tasks at t = 0, with worst-case blocking due to non-preemption occurring simultaneously. This situation is depicted in Fig. 16.5, which shows the task with the largest execution time beginning execution one time unit prior to the simultaneous arrival of the other tasks.

Fig. 16.5 Worst-case priority inversion induced by task i arriving at t = -1

These conditions can be formalized to obtain a schedulability test, which is captured by the following conditions:

\[ U \le 1.0, \qquad \forall t,\ 0 < t < L:\ h_b(t) \le t \tag{16.2} \]

where:

\[ h_b(t) = \sum_{i=1}^{n} \max\left\{0, \left\lfloor \frac{t + p_i - d_i}{p_i} \right\rfloor\right\} c_i + \max_{d_i > t}\{c_i - 1\} \tag{16.3} \]

and:

\[ L = \max\left\{ d_1, \ldots, d_n, \frac{\sum_{i=1}^{n} (p_i - d_i)\, U_i}{1 - U} \right\} \tag{16.4} \]

It should be noted that the time complexity of an algorithm to decide Eqs. 16.2–16.4 is pseudo-polynomial (and hence highly efficient) whenever U < 1.0. Other upper bounds on the length of L are derived in [18]. The non-preemptive scheduling problem, in this formulation, turns out to be only weakly coNP-Complete. When compared to feasibility tests for other non-preemptive scheduling disciplines, this is significantly better. For example, it is known that deciding if a set of periodic processes can be scheduled by a cyclic executive or timeline scheduler is strongly NP-Hard [8, 9]; it is also known that deciding if a set of periodic processes can be scheduled by a TTC scheduler is strongly coNP-Hard² [19]. Note that strong and weak complexity results have a precise technical meaning; specifically, amongst other things the former rules out the prospect of a pseudo-polynomial time algorithm unless P = NP, whereas the latter does not. Thus, although a very efficient algorithm may be formulated to exactly test for Eqs. 16.2–16.4, it is thought that no exact algorithm can ever be designed to efficiently test schedulability for these alternate scheduling policies. Overall claim status: npEDF admits an efficient feasibility test for periodic (sporadic) tasks that ensures even worst-case priority inversions do not lead to deadline misses.
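The test of Eqs. 16.2–16.4 can be sketched in a few lines of Python. Several points in this sketch are assumptions rather than statements from the chapter: U_i is taken to mean c_i/p_i; the condition of Eq. 16.2 is evaluated only at absolute deadlines below L (the demand term only changes at those instants, and no job can be late before the first deadline), whereas a literal check at every t would be stricter; and the case U = 1 is rejected with an error because the bound of Eq. 16.4 is then undefined. The function name is hypothetical.

```python
from fractions import Fraction


def np_edf_schedulable(tasks):
    """Apply Eqs. 16.2-16.4 to integer (p, c, d) tasks; True if the test passes."""
    utilizations = [Fraction(c, p) for p, c, d in tasks]
    total = sum(utilizations)
    if total > 1:
        return False
    if total == 1:
        # Eq. 16.4 divides by (1 - U); another bound on L would be needed here.
        raise ValueError("the bound of Eq. 16.4 is undefined for U = 1")

    # Analysis interval L (Eq. 16.4), assuming U_i = c_i / p_i.
    slack_term = sum((p - d) * u for (p, c, d), u in zip(tasks, utilizations))
    L = max(max(d for p, c, d in tasks), slack_term / (1 - total))

    # Assumption: only absolute deadlines k*p_i + d_i below L are checked.
    points = set()
    for p, c, d in tasks:
        t = d
        while t < L:
            points.add(t)
            t += p

    for t in sorted(points):
        # Processor demand of jobs with deadlines at or before t (Eq. 16.3).
        demand = sum(max(0, (t + p - d) // p) * c for p, c, d in tasks)
        # Worst-case blocking by a job whose deadline lies after t.
        blocking = max((c - 1 for p, c, d in tasks if d > t), default=0)
        if demand + blocking > t:
            return False
    return True
```

For the example task set [(4, 1, 4), (6, 2, 6), (12, 3, 12)] this sketch checks t = 4, 6 and 8 and reports the set as schedulable, whereas [(4, 2, 4), (9, 4, 9)] fails at t = 4 because the blocking from the longer task cannot be absorbed.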

16.4.3 npEDF is not Robust to Reductions in System Load

With respect to this complaint, Jane Liu presents some convincing evidence on p. 73 of her book Real-Time Systems [12], and cites the seminal paper by Graham [20] investigating timing anomalies. There are two principal problems here. The paper by Graham deals only with the multiprocessor case; specifically, it investigates the effects of reduced (aperiodic) task execution times on the makespan produced by the LPT heuristic scheduling technique. So do the examples on p. 73 of Liu’s book, although this is not made explicitly clear. With respect to single-processor scheduling, these examples simply do not apply; the only single-processor timing anomaly referred to in the Liu text is reproduced in Fig. 16.6. At first glance, it seems that a run-time reduction in the execution requirement of job C1 does, indeed, lead to a deadline miss of J3. However, upon closer inspection, this example can be seen to be almost identical to the example given in Fig. 16.3, with the execution of J1 between t = 3 and t = 4 effectively serving the same purpose as the inserted idle-time in Fig. 16.3. In order for this example to hold up, it must logically follow that the task set is provably schedulable when the tasks have the nominal parameters given in (A); applying Eqs. 16.2–16.4 to these tasks, it can be quickly determined that the task set is not deemed to be schedulable, since the formulation of Jeffay’s feasibility test takes worst-case priority inversion into account. This example is misleading with respect to npEDF: since the task set simply fails the basic feasibility test, Liu’s argument of ‘an otherwise schedulable task set’ becomes a non-starter. This again highlights the fact that misconceptions regarding robustness and priority inversions have principally arisen from one simple fact; as shown in the previous section, the worst-case behavior of a task set (its critical instants) under non-preemptive scheduling is not the same as under preemptive scheduling. Overall claim status: If appropriate (off-line) analysis is performed to confirm the schedulability of a task set, this task set will remain schedulable under npEDF even under conditions of reduced system load.

Fig. 16.6 Evidence for a lack of npEDF robustness?

Fig. 16.7 Converting from an absolute (left) to a modular (right) representation of time

² In fact, this situation is known to be considerably worse than this. The problem is actually known to be NP^NP-Complete [19]. Under the assumption that P ≠ NP, this means that the feasibility test requires an exponential number of calls to a decision procedure which is itself strongly coNP-Complete.

16.4.4 Timer Rollover Can Lead to Deadline Misses

This complaint can in fact be shown to hold, but the problem is easily solved. The assumption that time is represented as an integer (and in embedded systems, normally with a fixed number of bits, e.g. 16) eventually leads to timer rollover problems; deadlines will naturally ‘wrap around’ due to the modular representation of time. Since the normal laws of arithmetic no longer hold, it cannot be guaranteed that $d_i \bmod 2^b < d_j \bmod 2^b$ when $d_i < d_j$ and time is represented by b-bit unsigned integers. There are several efficient techniques that may be used to overcome this problem; perhaps the most efficient is as follows. Assuming that the inequality $p_m < 2^b/2$ holds over a given task set, i.e. the maximum period is less than half the linear lifetime of the underlying timer, the rollover problem may be efficiently overcome by using Carlini and Buttazzo’s Implicit Circular Timer Overflow Handler (ICTOH) algorithm [21]. The algorithm has a very simple code implementation, and is shown as C code in Fig. 16.7.


Fig. 16.8 Density of scheduling events for both TTC and npEDF scheduling

The algorithm’s operation exploits the fact that the modular distance between any two events (e.g. deadlines or activation times) x and y, encoded by b-bit unsigned integers, may be determined by performing a subtraction modulo $2^b$ between x and y, with the result interpreted as a signed integer.

Overall claim status: rollover is easily handled by employing algorithms such as ICTOH.
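The modular-distance idea described for rollover handling can be illustrated with a short sketch. This is not the C code of Fig. 16.7 nor the ICTOH implementation from [21]; it is a minimal Python illustration that emulates b-bit unsigned arithmetic, and the function names `modular_distance` and `is_earlier` are hypothetical. It assumes, as in the text, that the two events are less than half the timer range apart.

```python
def modular_distance(x: int, y: int, bits: int = 16) -> int:
    """Signed distance x - y on a timer that wraps at 2**bits."""
    modulus = 1 << bits
    diff = (x - y) % modulus      # subtraction modulo 2**bits
    if diff >= modulus // 2:      # top bit set: reinterpret as a negative value
        diff -= modulus
    return diff


def is_earlier(x: int, y: int, bits: int = 16) -> bool:
    """True if event x precedes event y, assuming they lie within half the timer range."""
    return modular_distance(x, y, bits) < 0


# Two deadlines 10 ticks apart that straddle a 16-bit rollover (illustrative values).
d_i = 65530 % (1 << 16)            # stays at 65530
d_j = (65530 + 10) % (1 << 16)     # wraps around to 4
naive_order = d_i < d_j            # plain comparison gets the order wrong
modular_order = is_earlier(d_i, d_j)
```

With these values the plain comparison reports the later deadline as the earlier one, whereas the signed modular distance preserves the true ordering.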

16.4.5 npEDF Use Leads to Large Scheduling Overheads

In order to shed more light on this issue, let us consider the required number of ‘scheduling events’ over the hyperperiod (major cycle) of a given periodic task set, and also the complexity (the required CPU iterations, as a function of the task parameters) of each such event. Specifically, let us consider these scheduling events as required for task sets under both npEDF and TTC scheduling. TTC scheduling is considered as the baseline case in this respect, as it has previously been argued that a TTC scheduler provides a software architecture with minimal overheads and resource requirements [4, 7, 14]. With npEDF, one scheduling event is required for each and every task execution, and the scheduler enters idle mode when all pending tasks have been executed. It can be woken by an interrupt set to match the earliest time at which a new task will be invoked. The TTC algorithm is designed to perform a scheduling event at regular intervals, in response to periodic timer interrupts; the period of these interrupts is normally set to be the greatest common divisor of the task periods [4, 14]. Let the major cycle h of a set of synchronous tasks be given by h = lcm(p1, p2, …, pn). The numbers of scheduling events occurring in h for the TTC scheduler, $SE_{TTC}$, and for the npEDF scheduler, $SE_{EDF}$, are given by:

\[
SE_{TTC} = \frac{\operatorname{lcm}(p_1, p_2, \ldots, p_n)}{\gcd(p_1, p_2, \ldots, p_n)}, \qquad
SE_{EDF} = \sum_{i \in \tau} \frac{\operatorname{lcm}(p_1, p_2, \ldots, p_n)}{p_i}
\tag{16.5}
\]

Clearly $SE_{EDF} \le SE_{TTC}$ in almost all cases, and an example to highlight this is shown for the task set τ = [(90, 5), (100, 5)] in Fig. 16.8, where scheduling events are indicated by the presence of up-arrows on the timeline. Also of interest are the time complexities of each scheduling event. Given the design of the TTC scheduler, it is clear from its implementation (see, for example, [4, 14]) that its complexity is O(n). Task management in the npEDF scheduler significantly improves upon this situation; it is known that the algorithm can be implemented with complexity O(log n) or better, in some cases O(1) [22]. To further illustrate this final point, Fig. 16.9 shows a comparison of the overheads incurred per scheduling event as the number of tasks was increased on a 72 MHz ARM7-TDMI microcontroller. Overhead execution times were extracted using the technique described in [22]. This graph clearly shows the advantages of the npEDF technique, and for n > 8 the overheads become increasingly smaller.

Fig. 16.9 CPU overheads vs. number of tasks

Overall claim status: With an appropriate implementation, the density of npEDF scheduling events is significantly better than competing methods; the CPU overheads incurred at each such event are also significantly lower.
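As a worked instance of Eq. 16.5, the following sketch counts scheduling events per hyperperiod for the two-task example with periods 90 and 100. The helper name `scheduling_events` is hypothetical and the code only evaluates the two expressions of Eq. 16.5; it does not simulate either scheduler.

```python
from functools import reduce
from math import gcd


def lcm(a: int, b: int) -> int:
    """Least common multiple of two positive integers."""
    return a // gcd(a, b) * b


def scheduling_events(periods):
    """Return (SE_TTC, SE_EDF) over one hyperperiod, following Eq. 16.5."""
    h = reduce(lcm, periods)                 # major cycle h = lcm(p1, ..., pn)
    tick = reduce(gcd, periods)              # TTC timer period = gcd(p1, ..., pn)
    se_ttc = h // tick                       # one event per timer tick
    se_edf = sum(h // p for p in periods)    # one event per task execution
    return se_ttc, se_edf


# Periods of the task set [(90, 5), (100, 5)].
se_ttc, se_edf = scheduling_events([90, 100])
```

For these periods h = 900 and the timer tick is 10, giving 90 scheduling events for TTC against 10 + 9 = 19 for npEDF.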

16.5 Conclusions

This chapter has considered the non-preemptive version of the Earliest Deadline First algorithm, and has investigated the supposed problems that have been attributed to this form of scheduling technique. Examples and analysis have been given to show that these problems are either baseless or trivially solved, and in most cases npEDF outperforms many other non-preemptive software architectures. As such, it is the conclusion of the current chapter that npEDF should be considered as one of the primary algorithms for implementing resource-constrained real-time and embedded systems. A preliminary version of the work described in this chapter was presented at the World Congress on Engineering, July 2010 [23].

References

1. Liu CL, Layland JW (1973) Scheduling algorithms for multiprogramming in a hard real-time environment. J ACM 20(1):46–61
2. Coffman E Jr (1976) Introduction to deterministic scheduling theory. In: Computer and job-shop scheduling theory. Wiley, New York
3. Dertouzos ML (1974) Control robotics: the procedural control of physical processes. Inf Process 74
4. Pont M (2001) Patterns for time-triggered embedded systems. ACM Press/Addison-Wesley Education, New York
5. Buttazzo GC (2005) Hard real-time computing systems: predictable scheduling algorithms and applications. Springer, New York
6. Jeffay K, Stanat D, Martel C (1991) On non-preemptive scheduling of periodic and sporadic tasks. In: Proceedings of the IEEE Real-Time Systems Symposium
7. Short M, Pont M, Fang J (2008) Exploring the impact of pre-emption on dependability in time-triggered embedded systems: a pilot study. In: Proceedings of the 20th Euromicro Conference on Real-Time Systems (ECRTS 2008), Prague, Czech Republic, pp 83–91
8. Garey MR, Johnson DS (1979) Computers and intractability: a guide to the theory of NP-completeness. W.H. Freeman & Co Ltd, New York
9. Burns A, Hayes N, Richardson M (1994) Generating feasible cyclic schedules. Control Eng Pract 3(2):151–162
10. Baker TP, Shaw A (1989) The cyclic executive model and Ada. Real-Time Syst 1(1):7–25
11. Locke CD (1992) Software architecture for hard real-time applications, cyclic executives vs. fixed priority executives. Real-Time Syst 4(1):37–52
12. Liu JWS (2000) Real-time systems. Prentice-Hall, New Jersey
13. Baruah S, Rosier L, Howell R (1991) Algorithms and complexity concerning the preemptive scheduling of periodic, real-time tasks on one processor. Real-Time Syst 2(4):301–324
14. Gendy AK, Pont MJ (2008) Automatically configuring time-triggered schedulers for use with resource-constrained, single-processor embedded systems. IEEE Trans Ind Inform 4(1):37–45
15. Howell R, Venkatrao M (1995) On non-preemptive scheduling of recurring tasks using inserted idle times. Inf Comput 117:50–62
16. Park M (2007) Non-preemptive fixed priority scheduling of hard real-time periodic tasks. Lect Notes Comput Sci 4990:881–888
17. Jackson JR (1955) Scheduling a production line to minimize maximum tardiness. Research report 43, Management Science Research Project, University of California, Los Angeles, USA
18. George L, Rivierre N, Spuri M (1996) Preemptive and non-preemptive real-time uniprocessor scheduling. Research report RR-2966, INRIA, Le Chesnay Cedex, France
19. Short M (2009) Some complexity results concerning the non-preemptive ‘thrift’ cyclic scheduler. In: Proceedings of the 6th International Conference on Informatics in Control, Robotics and Automation (ICINCO 2009), Milan, Italy, July 2009, pp 347–350
20. Graham RL (1969) Bounds on multiprocessing timing anomalies. SIAM J Appl Math 17:416–429
21. Carlini A, Buttazzo GC (2003) An efficient time representation for real-time embedded systems. In: Proceedings of the ACM Symposium on Applied Computing (SAC 2003), Florida, USA, March 2003, pp 705–712
22. Short M (2010) Improved task management techniques for enforcing EDF scheduling on recurring task sets. In: Proceedings of the 16th IEEE Real-Time and Embedded Technology and Applications Symposium (RTAS 2010), Stockholm, Sweden, April 2010, pp 56–65
23. Short M (2010) The case for non-preemptive, deadline-driven scheduling in real-time embedded systems. In: Lecture notes in engineering and computer science: Proceedings of the World Congress on Engineering 2010 (WCE 2010), vol 1. London, UK, 30 June–2 July 2010, pp 399–404

Chapter 17

Towards Linking Islands of Information Within Construction Projects Utilizing RF Technologies

Javad Majrouhi Sardroud and Mukesh Limbachiya

Abstract Modern construction management requires real-time and accurate information to be shared among all the parties involved in order to undertake efficient and effective planning and execution of projects. Research projects conducted during the last decade have concluded that information management is a critical factor in construction project performance and plays an essential role in managing construction where projects need to be completed within a defined budget and deadline. Recently, wireless sensor technologies have matured and become both technically and economically feasible and viable. This research investigates a framework for integrating the latest innovations in Radio Frequency (RF) based information management systems to automate the collection and sharing of detailed and accurate information in an effective way throughout actual construction projects. The solution presented here is intended to extend the use of a cost-effective and easy-to-implement system (Radio Frequency Identification (RFID), Global Positioning System (GPS), and Global System for Mobile Communications (GSM)) to facilitate low-cost and network-free solutions for obtaining real-time information and sharing it among the participants involved in ongoing construction projects, such as the owner, consultant, and contractor.

J. M. Sardroud (corresponding author)
Faculty of Engineering, Central Tehran Branch, Islamic Azad University, Tehran, Iran
e-mail: [email protected]

M. Limbachiya
School of Civil and Construction Engineering, Kingston University London, Kingston upon Thames, London, KT1 2EE, UK
e-mail: [email protected]

S. I. Ao and L. Gelman (eds.), Electrical Engineering and Applied Computing, Lecture Notes in Electrical Engineering, 90, DOI: 10.1007/978-94-007-1192-1_17, © Springer Science+Business Media B.V. 2011


17.1 Introduction

Construction is identified internationally as one of the information-intensive industries, one that is subject to an open environment and harsh conditions [1, 2]. Due to the complex, unprepared, and uncontrolled nature of the construction site, automated advanced tracking and data storage technologies are needed for efficient information management; the construction industry has also benefited greatly from technology that raises the speed of information flow, enhances the efficiency and effectiveness of information communication, and reduces the cost of information transfer [3]. Missing and delayed information access constitutes 50–80% of the problems in construction. One of the major sources of information is the data collected on construction sites. Even though the collection of detailed, accurate, and sufficient information and its timely delivery are vital for effective construction management, current on-site information management methods remain manual, relying on human effort with paper and pencil in all parts of the construction phase [4]. Previous observations on construction sites indicate that 30–50% of the field supervisory personnel’s time is spent on recording and analyzing field data [5] and that 2% of the work on construction sites is devoted to manual tracking and recording of progress data [6]. Data collected using manual methods are not reliable or complete, owing to the reluctance of workers to monitor and record the flow of large quantities of elements. Data collected through these methods are usually transferred and stored in paper-based format, which is difficult to search and access, and which makes processing data into useful information expensive and unreliable. Thus, some information items end up being unavailable to the parties who need timely access to them in order to make effective decisions [7].
Effective and immediate access to information minimizes the time and labour used for retrieving information related to each part of construction and reduces the occurrence of ineffective decisions made in the absence of information [8]. The process of capturing quantity-of-work data at a construction site needs to be improved in terms of accuracy and completeness in order to eliminate unnecessary communication loops and secondary tasks caused by missing or inaccurate data. All of this suggests the need for a fully automatic data collection technology that captures status information throughout construction and integrates this data into a database automatically. The ubiquitous system developed in this research has the potential to extend the boundary of information systems from the actual work sites to the site offices and to ensure real-time data flow among all participants of construction projects. This research investigates fully automated data collection using integrated applications of Radio Frequency Identification (RFID), Global Positioning System (GPS), and Global System for Mobile Communication (GSM) technologies in the construction industry, focusing on the real-time exchange of information between on-site and off-site locations. The system provides a clear path for obtaining real-time information and sharing it via the Internet among the participants involved in the construction phase, such as the owner, consultant, and contractor. The solution presented here is intended to extend the use of the current technologies RFID, GPS, and GSM to facilitate extremely low-cost and network-free solutions that form the backbone of an information management system for practical communication and control among construction participants.

The remainder of this paper first reviews previous research efforts relating to applications of wireless technologies in construction, followed by an assessment of the enabling technologies utilized in this research. It then presents the architecture of the integrated system for collecting and sharing real-time information in all parts of the construction phases. Finally, conclusions are given.

17.2 Prior Research Efforts

Many research projects have focused on the application of wireless technologies in the construction sector [9]. These technologies can be used for tracking, locating, and identifying materials, vehicles, and equipment, leading to important changes in managerial tasks in the construction industry [10]. Recent research projects have looked at the potential of using wireless technologies in the construction sector to improve the process of capturing data [11–13]; some of these are discussed here. Jaselskis et al. summarized RFID technology and surveyed its potential applications in the construction industry, including concrete processing and handling, cost coding of labour and equipment, and material control [14]. Chen et al. conducted an experiment in which a bar-code technique was used to facilitate effective management of construction materials and so reduce construction waste [15]. Jaselskis and El-Misalami implemented RFID to receive and keep track of pipe spools in the industrial construction process. Their pilot test demonstrated that RFID could increase operational efficiency by saving time and cost in material receiving and tracking [16]. Oloufa et al. examined the use of differential GPS technology on construction sites to avoid equipment collisions [17]. Jang and Skibniewski developed an Automated Material Tracking system based on ZigBee localization technology with two different types of query and response pulses [18]. Song et al. developed a system that can identify the logistics flow and location of construction materials with better performance by using wireless sensor networks such as ZigBee technologies [19]. Majrouhi Sardroud and Limbachiya investigated the use of RFID technology in construction information delivery and management [4].
In some research efforts, authors have developed RFID-based methods to automate the task of tracking and locating construction materials and components in lay-down yards and under shipping portals [20–29], and to improve the efficiency of tracking tools and the movement of construction equipment and vehicles on and off construction sites [30–34]. Although the aforementioned research has proven the value and potential of wireless technologies, in reality the effective use of a cost-effective, scalable, and easy-to-implement information management system on actual construction projects is scarce. This research created a framework for integrating the latest innovations in Automated Data Collection (ADC) technologies such as RFID, GPS, and GSM, providing a clear path to automate the collection and sharing of detailed, accurate, and sufficient information throughout the construction phase with minimal or no human effort.

17.3 Technology Description

Recently, wireless sensor technologies have matured and become both technically and economically feasible and viable to potentially support information delivery and management for the construction industry. Advanced tracking and data storage technologies such as RFID, GPS, and GSM provide automated data collection during the construction phases and allow all participants to share data accurately, completely, and almost instantly. In recent years, RFID has attracted attention as an alternative to the bar code and has been successfully applied in manufacturing, the distribution industry, supply chains, agriculture, transportation, and healthcare [35, 36]. RFID is a method of remotely storing and retrieving data by utilizing radio frequency in identifying, tracking, and detecting various objects [37]. An early, if not the first, work exploring RFID is the landmark paper by Harry Stockman, “Communication by Means of Reflected Power” [38]. An RFID system consists of tags (transponders) with an antenna, a reader (transceiver) with an antenna, and a host terminal. The RFID reader acts as a transmitter/receiver and transmits an electromagnetic field that “wakes up” the tag and provides the power required for the tag to operate [3]. A typical RFID system is shown in Fig. 17.1.

An RFID tag is a portable memory device located on a chip that is encapsulated in a protective shell; it can be attached to any object and stores dynamic information about that object. Tags consist of a small integrated circuit chip coupled with an antenna to enable them to receive and respond to radio frequency queries from a reader. Tags can be categorized as read-only (RO), write once, read many (WORM), and read-write (RW), and the capacity of their built-in memory varies from a few bits to thousands of bits. RFID tags can be classified into active tags (battery powered) and passive tags, which are powered solely by the magnetic field emanating from the reader and hence have an unlimited lifetime. Reading and writing ranges depend on the operating frequency (low, high, ultra high, and microwave). Low frequency systems generally operate at 124, 125 or 135 kHz. High frequency systems operate at 13.56 MHz, and ultra high frequency (UHF) systems use a band anywhere from 400 to 960 MHz [39]. Tags operating at UHF typically have longer reading ranges than tags operating at other frequencies. Similarly, active tags typically have longer reading ranges than passive tags. Tags also vary in the amount of information they can hold, life expectancy, recyclability, attachment method, usability, and cost. The communication distance between RFID tags and readers may decrease significantly due to interference from steel objects and moisture in the vicinity, which is commonplace on a construction site. Active tags have an internal battery source and therefore have a shorter lifetime of approximately three to ten years [16]. The reader, combined with an external antenna, reads/writes data from/to a tag via radio frequency and transfers data to a host computer. The reader can be configured either as a handheld or as a fixed-mount device [40]. The host and software system is an all-encompassing term for the hardware and software components that are separate from the RFID hardware (i.e., reader and tag); the system is composed of the following four main components: edge interface/system, middleware, enterprise back-end interface, and enterprise back end [14].

Fig. 17.1 A typical RFID system (RFID tag, RFID reader, host terminal)
RFID tags are more durable and better suited to a construction site environment than barcodes, which are easily peeled off and may be illegible when they become dirty. RFID tags are not damaged as easily and do not require line-of-sight for reading and writing; they can also be read in direct sunlight, survive harsh conditions, are reusable, and permit remote reading [4]. RFID tags can be manufactured in a wide variety of shapes to suit the assets to which they are attached [41].

GPS is a Global Positioning System based on satellite technology. The activities on GPS were initiated by the US Department of Defence (DOD) in the early 1970s under the term Navigation Satellite Timing and Ranging System (NAVSTAR). Glonass, Galileo, and BeiDou are the Russian, European Union, and Chinese global positioning systems, respectively [42, 43]. GPS consists of nominally 24 satellites that provide the ranging signals and data messages to the user equipment [44]. To calculate locations, readings from at least four satellites are necessary, because there are four parameters to calculate: three location variables and the receiver’s time [45]. To obtain metre-level or sub-metre accuracy in positioning data (i.e. longitude, latitude, and altitude), a single GPS receiver is not sufficient; instead, a pair of receivers perform measurements with common satellites and operate in a differential mode (DGPS). DGPS provides sufficient accuracy for most outdoor tracking applications. In DGPS two receivers are used. One receiver measures the coordinates of a stationary point, called the base, whose position is perfectly known in the reference geodetic system used by GPS. The 3-D deviation between the measured and actual position of the base, which is roughly equal to the measurement error at a second receiver at an unknown point (called the “rover”), is used to correct the position computed by the latter [46].

GSM is a worldwide standard for cellular communications. The idea of cell-based mobile radio systems appeared at Bell Laboratories in the early 1970s. In 1982 the Conference of European Posts and Telecommunications formed the Groupe Spécial Mobile (GSM) to develop a pan-European mobile cellular radio system (the acronym later became Global System for Mobile communications). One of the currently available technologies for mobile data transfer is the General Packet Radio Service (GPRS). GPRS is a packet-switched “always on” technology which allows data to be sent and received across a mobile telephone network almost instantly, so immediacy is one of the advantages of GPRS [47].
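The differential correction described for DGPS above can be written as a short calculation: the error observed at the base is subtracted from the position computed at the rover. The sketch below is an illustration only; the function name `dgps_correct` and the sample coordinates are hypothetical, and positions are assumed to be expressed in a local Cartesian frame in metres rather than in latitude and longitude.

```python
from typing import Tuple

Vec3 = Tuple[float, float, float]


def dgps_correct(base_known: Vec3, base_measured: Vec3, rover_measured: Vec3) -> Vec3:
    """Correct the rover position using the 3-D deviation observed at the base."""
    # Deviation between the measured and the actual (surveyed) base position.
    error = tuple(m - k for m, k in zip(base_measured, base_known))
    # The same deviation is assumed to affect the rover, so it is subtracted.
    return tuple(r - e for r, e in zip(rover_measured, error))


# Illustrative values only (metres in a local site frame).
base_known = (100.0, 200.0, 50.0)
base_measured = (102.0, 199.0, 50.5)
rover_measured = (352.0, 419.0, 60.5)
rover_corrected = dgps_correct(base_known, base_measured, rover_measured)
```

The correction is only as good as the assumption that both receivers see roughly the same error, which is why the text requires the two receivers to measure common satellites.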

17.4 Architecture of the Proposed System

Fig. 17.2 A schematic model of U-Box (blocks: micro controller, RFID reader, GPS, GSM, memory, battery, motion sensor, external sensors, LED indicators; RFID, GSM, and GPS antennas)

The RFID-based ubiquitous system (U-Box) utilized in this research is a combination of GPS, RFID, and GSM and, as such, takes advantage of the respective strengths of each. The system can be divided into two major parts: the mobile system and the central station. The mobile system mainly consists of three types of hardware components, namely (i) GPS technology; (ii) RFID technology, where passive High Frequency (HF) and Ultra High Frequency (UHF) band RFID tags are used for identifying and obtaining the object/user related information by means of an RFID reader plugged into the mobile system; and (iii) GSM communication technology, where the information (ID, specific information, and date) retrieved from the RFID readers and GPS is transferred to the server using GPRS or SMS. The central station consists of two servers: the application server (portal system) and the database server (project database). In this approach, data collection is done continuously and autonomously; RFID is therefore used to solve the information collection problem, and the portal with GSM technology is used to solve the information communication problem in the construction industry. A schematic model of U-Box is shown in Fig. 17.2.


As can be seen in the block diagram, the device has a rechargeable internal battery and a motion sensor. The micro controller checks the source of power; if external power is still connected, the micro controller sends a command to the controller module to recharge the battery. The device also has its own internal memory to store information (latitude, longitude, date and time, etc.), can save information when it loses the GSM network, and sends the saved information immediately after registering with the network again. Users can attach external sensors to the U-Box; the micro controller reads the sensor information, stores it in internal memory, and then sends it to the data centre via a defined link. In the identification segment, active RFID is selected for moving probes, since the independent power supply of active RFID tags allows greater communication ranges, higher carrier frequencies, greater data transmission rates, better noise immunity, and larger data storage capacity. In the positioning segment, the GPS unit must be durable enough to function in open-air conditions. The GPS receiver has a nominal accuracy of 5 m with the Wide Area Augmentation System (WAAS). In the data transmission segment, GPRS and SMS have been selected to support data transmission between the U-Box and the central office. GPRS, connected to the GSM network via a SIM card, enables several new applications, such as Multimedia Messaging, which were not previously available over GSM networks due to the message length limit of SMS (160 characters). In this approach, on-site data collection begins with RFID tags that contain unique ID numbers and carry data about the host, such as item-specific information, in their internal memory. A tag can be placed on any object/user, such as materials or workers.
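The store-and-forward behaviour described above (saving readings while the GSM network is unavailable and sending them once the device re-registers) might be sketched as follows. This is not the U-Box firmware; the class names `Reading`, `StoreAndForward`, and `FakeLink`, and the field layout of a reading, are hypothetical and chosen only for illustration.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Deque, List


@dataclass(frozen=True)
class Reading:
    """One record: tag ID plus the position and time at which it was read."""
    tag_id: str
    lat: float
    lon: float
    timestamp: str


class StoreAndForward:
    """Buffer readings in memory and send them in order when the link is up."""

    def __init__(self, send: Callable[[Reading], bool]) -> None:
        self._send = send                       # returns True if the transfer succeeded
        self._pending: Deque[Reading] = deque()

    def record(self, reading: Reading) -> None:
        self._pending.append(reading)
        self.flush()                            # try to send straight away

    def flush(self) -> int:
        """Send buffered readings oldest first; stop at the first failure."""
        sent = 0
        while self._pending and self._send(self._pending[0]):
            self._pending.popleft()
            sent += 1
        return sent

    @property
    def pending(self) -> int:
        return len(self._pending)


class FakeLink:
    """Stand-in for the GSM link, used only to exercise the buffer."""

    def __init__(self) -> None:
        self.online = False
        self.delivered: List[Reading] = []

    def send(self, reading: Reading) -> bool:
        if self.online:
            self.delivered.append(reading)
        return self.online
```

Keeping the oldest reading at the head of the queue until it has been acknowledged means that a failed transfer never drops or reorders data.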
During the construction process, whenever an object is moved, the information on the RFID tag is captured and deciphered by the RFID reader connected to the mobile system, and the micro controller obtains the position from the GPS (which is part of the U-Box) and stores the location of the object/user. The ID and location information of the object/user is then sent to a database using GSM technology. In this approach, the tags are used only for identification, and all of the related information is uploaded and stored in one or more databases, indexed by the unique ID of each object. In an alternative mechanism, information can be stored directly on the RFID tags, with no data stored on the server. Information updates and announcements are sent synchronously via the portal, and the system effectively increases the accuracy and speed of data entry by providing owners, consultants, and contractors with real-time information related to any object/user. The application server defines various applications for collecting, sharing, and managing information. Any moving probes, such as materials handling equipment (top-slewing and bottom-slewing tower cranes, truck-mounted mobile cranes, and crawler cranes), hoists, internal and external delivery vehicles, the gates, and some key workers should be equipped with the U-Box. This intelligent system can be programmed to send back information via SMS when the RFID reader or user-defined sensors connected to the system receive new data, for example from a component loaded onto the truck or from data detected by sensors. Collected data are used on the application side through a web-based portal system for information sharing among all participants. Electronic exchange of collected information leads to a reduction in errors and improved efficiency of the operation processes. The portal system provides an organization with a single, integrated database, both within the organization and among the organizations and their major partners. With the portal system and its coupled tools, managers and workers of each participant can conduct valuable monitoring and controlling activities throughout the construction project. For instance, information is transmitted back to the engineering office for analysis and records, enabling the generation of productivity reports; this up-to-date information about construction enables effective management of the project.

One of the challenges of designing an effective construction information management system is designing an effective construction tagging system. Each RFID tag is equipped with a unique electronic identity code, which is usually the basis of reports that contain tracking information for a particular user/object. In choosing the right RFID tag for any application, there are a number of considerations, including frequency range, memory size, range performance, form factor, environmental conditions, and standards compliance. To minimize the performance reduction of the selected technology in contact with metal and concrete, RFID tags need to be encapsulated or insulated. Extremely heavy foliage or underground places like tunnels cause the signal to fade to the extent that it can no longer be received by the GPS or GSM antenna. When this happens, the receiver no longer knows its location and, in the case of an intelligent system application, the vehicle is technically lost and the central office will not receive information from the system.
In this case, to locate vehicles inside GPS blind areas, the intelligent system uses the RFID reader to save the IDs of the tags encountered on the way through the tunnel (each tag ID denotes a unique location); the device stores this information in internal memory as the current position, and the system sends the unsent data to the central office when the network is re-established. In this research, a geo-referenced map of the construction job site should be created once; it is then used to identify the locations of objects/users by comparing the coordinates received from the GPS with those in the geo-referenced map.
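The fallback just described, using the location associated with the last RFID tag read whenever no GPS fix is available, can be sketched as a simple lookup. The function name `resolve_position`, the tag IDs, and the coordinates below are hypothetical; the table `tag_locations` stands in for the surveyed positions of fixed tags, for example those installed along a tunnel.

```python
from typing import Dict, Optional, Tuple

Position = Tuple[float, float]   # (latitude, longitude)


def resolve_position(
    gps_fix: Optional[Position],
    last_tag_id: Optional[str],
    tag_locations: Dict[str, Position],
) -> Optional[Position]:
    """Prefer the GPS fix; otherwise use the known location of the last tag read."""
    if gps_fix is not None:
        return gps_fix
    if last_tag_id is not None:
        return tag_locations.get(last_tag_id)   # None if the tag is not in the table
    return None


# Illustrative table of fixed tags inside a GPS blind area.
tag_locations = {
    "TUNNEL-01": (51.4030, -0.3030),
    "TUNNEL-02": (51.4035, -0.3041),
}

in_open_air = resolve_position((51.4000, -0.3000), "TUNNEL-01", tag_locations)
in_tunnel = resolve_position(None, "TUNNEL-02", tag_locations)
unknown = resolve_position(None, None, tag_locations)
```

Returning `None` when neither source is available leaves the caller free to keep the last stored position rather than report a wrong one.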

17.5 Conclusions

The proposed system is an application framework of RFID-based automated data collection technologies that focuses on the real-time collection and exchange of information among all participants of the construction project, the construction site and the off-site office. This system can provide low-cost, timely, and faster information flow with greater accuracy by using RFID technology, GPS, GSM, and a portal system. In this research, data collection is done continuously and autonomously; therefore, the combination of the selected Radio Frequency (RF) based information and communication technologies as a powerful portable data collection tool enables collecting, storing, sharing, and reusing field data accurately, completely, and almost instantly. In this manner, up-to-date information regarding all parts of the construction phase is available, which permits real-time control and enables corrective actions to be taken. The system enables collected information to be shared among the involved participants of the construction phase via the Internet, which leads to important changes in construction project control and management.

The proposed system has numerous advantages. It is automatic, thus reducing labour costs and eliminating human error associated with data collection during the construction processes. It can dramatically improve construction management activities, which also helps to keep cost and time under control in the construction phase. The authors believe that, in practice, the proposed pervasive system can deliver a complete return on investment within a short period by reducing operational costs and increasing workforce productivity.


Chapter 18

A Case Study Analysis of an E-Business Security Negotiations Support Tool

Jason R. C. Nurse and Jane E. Sinclair

Abstract Active collaboration is undoubtedly one of the most important aspects within e-business. In addition to companies collaborating on ways to increase productivity and cut costs, there is a growing need for in-depth discussion and negotiations on their individual and collective security. This paper extends previous work on a tool aimed at supporting the cross-enterprise security negotiations process. Specifically, our goal in this article is to briefly present a case study analysis and evaluation of the usage of the tool. This provides further real-world insight into the practicality of the tool and the solution model which it embodies.

J. R. C. Nurse (✉) and J. E. Sinclair, Department of Computer Science, Warwick University, Coventry, CV4 7AL, UK
e-mail: [email protected]
J. E. Sinclair e-mail: [email protected]

S. I. Ao and L. Gelman (eds.), Electrical Engineering and Applied Computing, Lecture Notes in Electrical Engineering, 90, DOI: 10.1007/978-94-007-1192-1_18, © Springer Science+Business Media B.V. 2011

18.1 Introduction

E-business has matured into one of the most cost-efficient and streamlined ways of conducting business. As the use of this new business paradigm thrives, however, ensuring adequate levels of security for these service offerings emerges as a critical goal. The need for security is driven by an increasing regulatory and standards requirements base (e.g. EU Data Protection Act and US Sarbanes–Oxley Act) and escalating security threats worldwide (as indicated in [1]). Similar to the business-level collaborations necessary to facilitate these interactions, there also needs to be a number of discussions and negotiations on security. A key problem during collaborations, however, is the complex discussion task that often ensues as companies have different security postures, a range of disparate security needs, may have dissimilar laws/regulations which they each subscribe to, have different skill sets/experience levels and so on. Owing to these and other challenges, [2] aptly labels the related process as ‘security mayhem’.

With appreciation of the collaboration difficulties highlighted above, particularly in terms of security approaches in Web services-based interactions, in previous work we have presented BOF4WSS, a Business-Oriented Framework for enhancing Web Services Security for e-business [3, 4]. The framework’s novelty stemmed from its concentration on a cross-enterprise development methodology to aid collaborating e-businesses in jointly creating secure and trusted interactions. Additionally, BOF4WSS aims to fit together a majority of the critical pieces of the WS security puzzle (for example, key new approaches such as [5, 6]) to propose a well-rounded, highly structured, extensible framework (framework and methodology being synonymous in the context of this work).

Progressing from the BOF4WSS methodology itself, our emphasis has shifted to supplying software to support it and assist in its seamless application to business scenarios. In previous articles we have presented (see [7]) and initially evaluated (see [8]) one of these tools, which was developed to support and ease security negotiations across collaborating e-businesses. In terms of BOF4WSS, this refers specifically to easing the transition from the individual Requirements Elicitation stage to the subsequent joint Negotiations stage. Generally, some of the main problems identified and targeted included understanding other companies’ security documentation, understanding the motivation behind partnering businesses’ security needs/decisions, and being able to easily match and compare security decisions from entities which target the same situation and risk. Related work in [9, 10] and feedback from interviewed security practitioners support these issues.

Building on previous research, the aim of this paper therefore is to extend initial evaluation work in [8] and pull together the compatibility evaluation of the tool and the Solution model it embodies through the use of a case scenario analysis. This enables a more complete evaluation of the proposals because, unlike the compatibility assessment in [8], it progresses from the initial model stages to the final tool output produced. The scenario contains two companies using two popular system-supported security Risk Management/Assessment (RM/RA) methods. Topics to be considered in this analysis include: how tool data is transferred to the RM/RA approaches/software (as expected in the Solution model [7]); how typical RM/RA approach information is represented in the tool’s common, custom-built XML-based language; and finally, how close, if at all, the tool can bring together the different RM/RA approaches used by companies to ease stage transition within BOF4WSS. If the tool can interplay with a majority of the security-related information output from popular RM/RA techniques, its feasibility as a system that can work alongside current approaches used in businesses today will be evidenced.

The next section of this paper reviews the Solution model and resulting tool to support security negotiations across e-businesses. Then, in Sect. 18.3, we give a brief background on the business scenario and begin the case study analysis.


Findings are discussed as they are found. Section 18.4 completes this contribution by providing conclusions and outlining directions for future work.

18.2 Solution Model and Tool: A Recap

The Solution model is the conceptual base for the software tool developed in our research. It was initially presented in [7] and consists of four component stages. These are: Security Actions Analysis, Ontology Design, Language Definition and Risk Catalogue Creation.

Security actions analysis. This stage focuses on reviewing the literature in the security risk management field, and critically examining how security actions and requirements are determined. A security action is broadly defined as any way in which a business handles the risk it faces (e.g. ‘insurance will be purchased to protect against very skilled and sophisticated hacker attacks’), and a security requirement is a high-to-medium level desire, expressed to mitigate a risk (e.g. ‘classified information must be encrypted when transferred over a network connection’). The key outcome of this stage is a thorough understanding of the relevant security domain which could then be used as a foundation for future stages.

Ontology design. The aim of this component is to produce a high-level ontology design using the findings from the previous stage, to establish a common understanding and semantics structure of the security actions (and generally security risk management) domain. This common or shared understanding is a critical prerequisite when considering the difficulties businesses face (because of different terminologies used, RM/RA methods applied, and so on) as they try to understand their partners’ security documentation which is supplied in BOF4WSS’ Negotiations phase. Further detail on the Security Actions Analysis and Ontology Design stages (inclusive of draft ontology) can be seen in [11].

Language definition. This stage has two sub-components. First is the development of an XML-based language called Security Action Definition Markup Language (SADML). This allows for the establishment of a common format (based on the ontology) by which security actions/requirements information provided by companies is formally expressed, and also later processed by the resulting tool. Second is the proposal of a user-friendly interface such as a data entry screen or template document by which businesses’ security-related data could be entered, and subsequently marked up in SADML. This interface would act as a guide for companies in prompting them to supply complete information as they prepare to come together for negotiations.

Risk catalogue creation. This final component stage addresses the problem of matching and comparing security actions/requirements across enterprises by defining a shared risks catalogue. Given that businesses use risks from this shared catalogue as input to their RM/RA methods, regardless of the security actions that they decide individually, the underlying risks could be used by the tool to automatically match their actions. To increase flexibility, the catalogue would feature an extensive and updatable set of security risks.
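To make the Language Definition stage more concrete, the sketch below encodes one stored security action as XML with Python's standard `xml.etree.ElementTree` module. It is only an illustration and not the SADML schema: the element names `mitigationAction`, `details` and `securityPolicyRefs` are borrowed from the SADML excerpts shown later in this chapter, whereas the `sadml` root element, the `riskRef` attribute, the `name` and `ref` children, the record fields and the policy ID are all assumptions.

```python
import xml.etree.ElementTree as ET


def encode_mitigation(record: dict) -> str:
    """Serialise one stored security action as an XML string."""
    root = ET.Element("sadml")  # hypothetical root element
    # riskRef is an assumed attribute linking the action to a catalogue risk.
    action = ET.SubElement(root, "mitigationAction", riskRef=record["risk_id"])
    ET.SubElement(action, "name").text = record["name"]  # hypothetical child
    ET.SubElement(action, "details").text = record["details"]
    refs = ET.SubElement(action, "securityPolicyRefs")
    for policy_id in record.get("policy_ids", []):
        ET.SubElement(refs, "ref").text = policy_id  # hypothetical child
    return ET.tostring(root, encoding="unicode")


# Illustrative record; the field names and the policy ID are assumptions.
buyer_record = {
    "risk_id": "Risk101",
    "name": "Protect against eavesdropping on Web service messages",
    "details": "Ensure there is no eavesdropping on data in transit.",
    "policy_ids": ["POL-7"],
}
document = encode_mitigation(buyer_record)
```

Because every company would emit the same element and attribute names, a later comparison step could rely on the shared risk identifier rather than on each company's own terminology.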


Fig. 18.1 Process flow of implemented solution model. The flow runs from the shared risks (assets, threats, vulnerabilities) catalogue, with new risks exchanged between companies, into Comp A’s and Comp B’s risk management methodologies. Each company’s security actions and the factors motivating them (inclusive of risks, laws, policies, etc.) pass into its data entry & data storage system and then into its encoding system (based on the language). The encoded security actions and motivating factors of both companies feed the comparison system (matching based on risk), which provides (i) a user-friendly interface where security actions and the related security risks are automatically matched and displayed, and (ii) flagged inconsistencies that represent exceptional situations and thus should be discussed by personnel.

With a recap of the Solution model now provided, Fig. 18.1 shows a process flow of how the implemented model, i.e. the tool, works. In this diagram, Comp A and Comp B are companies using BOF4WSS for an online business scenario.

To explain the process flow: First, companies would select a set of risks from the catalogue that apply to their particular business scenario, and use these as input to their different RM/RA methodologies. Any new risks to be considered which are not available in the catalogue can be exchanged for this scenario. After companies have used their RM/RA approaches to determine their individual security actions (inclusive of motivational factors), these are then input into the Data entry and storage system. This system uses a user-friendly interface to read in the data (as suggested in the Language Definition stage), and stores it to a back-end database to allow for data retrieval, updating and so on. This interface, and generally the tool, mirror the understanding of concepts defined in the ontology. As companies are about to come together for Negotiations, the Encoding system is used to read security data from the database and encode it into SADML.

In the Negotiations stage of BOF4WSS, companies bring their individual SADML documents and these are passed to the tool’s Comparison system. This system matches companies’ security actions based on the risks which they address, and aims to provide a user-friendly interface in which (i) security actions can be quickly compared and discussed, (ii) any inconsistencies would be flagged for follow-up by personnel, and (iii) a shared understanding of security terms, risks and so on, will be upheld due to the references that can be made to the ontology. Next, in Sect. 18.3, we conduct the case study analysis to give further insight into the use and practicality of the tool proposed.
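The matching performed by the Comparison system can be pictured with a short Python sketch. It is a hedged illustration rather than the tool's actual logic: the function name `compare_actions`, the dictionary fields, the status labels and the rule that differing treatment types count as an inconsistency are all assumptions, as is the simplification that each company records at most one action per risk. `Risk102` is a made-up identifier used only to show an unmatched risk.

```python
from collections import defaultdict


def compare_actions(comp_a: list, comp_b: list) -> list:
    """Pair two companies' security actions by the shared risk ID they address."""
    by_risk = defaultdict(dict)
    for company, actions in (("A", comp_a), ("B", comp_b)):
        for action in actions:
            # Assumption: one action per company per risk; a later one overwrites.
            by_risk[action["risk_id"]][company] = action

    report = []
    for risk_id in sorted(by_risk):
        a = by_risk[risk_id].get("A")
        b = by_risk[risk_id].get("B")
        if a is None or b is None:
            status = "unmatched"      # only one company addressed this risk
        elif a["treatment"] != b["treatment"]:
            status = "inconsistent"   # flag for follow-up by personnel
        else:
            status = "consistent"
        report.append({"risk_id": risk_id, "status": status, "A": a, "B": b})
    return report


comp_a_actions = [
    {"risk_id": "Risk101", "treatment": "mitigate"},
    {"risk_id": "Risk102", "treatment": "mitigate"},
]
comp_b_actions = [{"risk_id": "Risk101", "treatment": "retain"}]
report = compare_actions(comp_a_actions, comp_b_actions)
```

Keying the comparison on the catalogue risk ID is what lets actions be lined up even when the two companies describe them in different words.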

18.3 Case Study Analysis

The core aim of this section is to complete the compatibility and feasibility evaluation first presented in [8], using a full case study analysis. In that previous work, a very detailed discourse and a number of mappings were presented. Now the objective is to put that and other aspects of the Solution model and tool into a more real-world context. In addition to further supporting the feasibility of this research’s proposals, this enables a more thorough evaluation of the model as it progresses from the initial Central Risk Catalogue to the final tool output.

The case scenario to be used consists of two businesses, Buyer and Supplier. These companies have worked with each other in the past using mainly manual and other offline interactions. To enable their processes to be more integrated and streamlined, the parties are now choosing to use the Internet and WS technology suite for online business-to-business communications. As security is a key priority for the companies, they are adopting BOF4WSS to aid in the creation of a secure WS-based business scenario. In line with this paper, the area of focus is the progression from the Requirements Elicitation stage to the Negotiations stage. This involves the passing of, and then negotiation on, entities’ security needs and requirements.

In terms of RM/RA and determining security needs and requirements, EBIOS and CORAS are the two methods used by entities. EBIOS is a risk management approach for assessing and treating risks in the field of information systems security [12]. CORAS is a tool-supported methodology for model-based risk assessment of security-critical systems [13]. Specifically, to analyze risk and determine security actions, Buyer uses EBIOS and its software, whereas Supplier employs CORAS and its supporting tool. Next, we begin the case study analysis.

According to the Solution model flow (see Fig. 18.1), regardless of the RM/RA method used, the starting point of the scenario should be a common risks base or catalogue. This point however is where one of the first difficulties in the evaluation surfaced. When the model was first conceived it was assumed that the transferring of common risk data to RM/RA approaches would be done manually. During the completion of this study however, such a process actually proved somewhat tedious. This is especially so in terms of the accurate and consistent mapping of data from the common risks catalogue to the RM/RA methods and software. If there was a risk to the confidentiality of Web services messages in the Risk Catalogue system, therefore, the problem was how that data and the related data on vulnerabilities, threats and assets could be quickly, accurately and consistently entered into the RM/RA approaches and their software. Figure 18.2 depicts the area of focus in the ‘Process flow of the implemented Solution model’ diagram (Fig. 18.1).

Possibly the best solution to this problem resides in the automated mapping of data from the Central Risk Catalogue to the RM/RA method software, which in this case is represented by the EBIOS and CORAS tools (used by Buyer and Supplier respectively). Two options were identified by which this could occur. The first option consisted of adding an export capability to the Central Risk Catalogue system, which would output data on risks in the machine-readable formats of common RM/RA approach software. This is beneficial because it would be a central point where numerous RM/RA software formats could be generated. Furthermore, it could take advantage of the ‘Import’ and ‘Open File/Project’ functionalities which are standard in a number of RM/RA software packages. For example, both CORAS and EBIOS tools have these capabilities.
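The first option, a central export capability, could be organised around one export profile per target tool. The Python sketch below shows the idea only; it does not reproduce the real EBIOS or CORAS file formats, which are not given here. The profile names `tool_x` and `tool_y`, every element name, the catalogue field names and the sample risk values are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical export profiles: catalogue field -> element name in the target file.
EXPORT_PROFILES = {
    "tool_x": {"root": "riskEntry", "id": "riskId", "asset": "asset",
               "threat": "threat", "vulnerability": "vulnerability"},
    "tool_y": {"root": "item", "id": "ref", "asset": "target",
               "threat": "source", "vulnerability": "weakness"},
}


def export_risk(risk: dict, target: str) -> str:
    """Render one catalogue risk using the element names of the chosen profile."""
    profile = EXPORT_PROFILES[target]
    root = ET.Element(profile["root"])
    for field_name in ("id", "asset", "threat", "vulnerability"):
        ET.SubElement(root, profile[field_name]).text = risk[field_name]
    return ET.tostring(root, encoding="unicode")


# Illustrative catalogue record; the field values are assumptions.
catalogue_risk = {
    "id": "Risk101",
    "asset": "Web services message",
    "threat": "Eavesdropping on messages in transit",
    "vulnerability": "Messages sent in inappropriately secured formats",
}
exports = {target: export_risk(catalogue_risk, target) for target in EXPORT_PROFILES}
```

Adding support for another RM/RA tool would then mean adding one more profile at the central point, which mirrors the benefit claimed for this option.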

Fig. 18.2 Area of focus in process flow of implemented solution model. The risks (assets, threats, vulnerabilities) catalogue passes the ‘risk to the confidentiality of Web services messages’ to both Supplier’s CORAS methodology & software and Buyer’s EBIOS methodology & software; new risks are exchanged.

One caveat noticed when assessing the Risk Catalogue export capability option is that unique identification numbers (IDs) for elements (e.g., Menace IDs in EBIOS or risk-analysis-result IDs in CORAS) generated by the Central system might conflict with the same element IDs generated by the actual software running at each company. There would therefore need to be some agreed allotment of ID ranges for the Catalogue-based option to function properly.

The second option suggests a more decentralized implementation where extensions could be added to the RM/RA software systems to enable them to read in and process Risk Catalogue system data. This would avoid the problem of conflicting IDs, but introduces the need to access, understand and edit various software systems. For this case, EBIOS and CORAS are good candidates in this regard as both are open source implementations (see [12] and [13] respectively).

Apart from the programming that would be necessary in both options above, there is the question of exactly how to map Risk Catalogue system data to EBIOS and CORAS. This however can be largely addressed by reversing the mapping tables used as the basis for previous evaluation work in [8]. This is because the tool’s Entity Relationship Diagram (ERD) is not dissimilar to that of the Risk Catalogue system. Essentially, one would now be going from database records to EBIOS and CORAS software XML formats. Risk, ProjectRisk, Asset, Vulnerability and Threat are some of the main database tables mapped in [8] that would be used in reverse to map risks data from the Catalogue system.

Having briefly digressed from the case study to discuss how transferring data from the shared risks catalogue could be addressed, the focus resumes at the RM/RA software stage. This relates to the bottom two boxes in Fig. 18.2. After Supplier and Buyer have agreed the risks to be used, they conduct their individual analyses. This generally encompasses the processes of risk estimation, risk evaluation and treatment. The two code snippets below give an initial idea of the data generated by each entity’s RM/RA method. This and most of the following examples are based around a security risk defined by companies relating to the integrity and confidentiality of Web services messages passed between them during online interactions. Hereafter, this is referred to simply as Risk101; ‘Risk101’ is also used as the lower-level ID value originating from the risks catalogue which is employed in each company’s RM/RA software. From the code snippets, one can see exactly how different the representations of the same risk may be from company to company. As would be expected, a similar reality exists regarding the other types of data produced (e.g. related to risk factors, risk estimates, security actions and so on). The + sign in the code indicates that there is additional data which is not displayed/expanded considering space limitations.

18

A Case Study Analysis of an E-Business Security Negotiations Support Tool

215

+ <ScenarioPotentiality potentiality="Potentiality.1076645892186">

Code Snippet 1. EBIOS (Buyer) representation of the risk

Risk101
WSMessage
Eavesdropping and tampering with data in a Web services' message (in transit)
Medium
Low

Code Snippet 2. CORAS (Supplier) representation of the risk

With the RM/RA methodologies at each business complete, the next step was mapping the output data from Buyer and Supplier to the tool. This process was covered in detail in [8] and therefore is not analyzed in depth here. From a case study perspective however, one intriguing additional observation was made: although RM/RA methods did not accommodate certain data in a very structured way as expected by the tool, it did not mean that the data was not present in companies’ considerations. In Supplier’s CORAS software output shown in Code Snippet 3 for example, it is apparent that a limited security budget influenced Supplier’s treatment strategy decision (see treatmentDescription columnId). Any automated mapping to our tool therefore should ideally capture this data as a unique Risk Treatment factor. To recap, a treatment factor is an aspect that influences or in some way motivates a particular treatment for a risk. Common examples are laws, regulations, security policies, limited budgets and contractual obligations. Capturing this treatment data was not possible however because the machine-readable output of CORAS does not distinctly define such aspects in its XML structure. Here it is just in plain text.

TRT101
Risk101
Retain
The unlikeliness of this risk and a limited security budget are the reasons for risk acceptance
Threat_Analysis09.doc

Code Snippet 3. CORAS representation of a risk treatment

A similar situation is present in Buyer’s EBIOS output regarding risk estimation. In this case, Buyer has used EBIOS to prioritize risks; however, because the technique is so elaborate, it does not allow for a clear and reliable automated mapping to the risk level concepts in the tool.


To tackle these mapping issues a few other techniques were assessed but manual mapping proved to be the only dependable solution. This mapping involved noting the type of data requested by the tool (such as influential security policies or budgetary limitations) and using its data entry screens to manually enter that data. This was easily done in this case through the creation of a TreatmentFactor record in the tool and then linking that record to the respective risk treatment, formally the SecurityAction database record. The TreatmentFactor table is used to store elements that influence or affect the treatment of risks. Examples of such were mentioned previously. Regarding the manual risk estimation and prioritization mapping needed for EBIOS mapping, a level of subjectivity would be introduced as users seek to map values in their analysistotherisklevelsexpectedinthetool.Tocompensateforthissubjectivity,detailed justifications and descriptions of chosen risk levels should be provided by parties. This information would be entered in the tool’s respective RiskEstimate database record’s risk_level_remarks, probability_remarks, impact_remarks and adequacy_of_ controls_remarks fields. (The RiskEstimate table defines the value of a risk, the probability and impact of it occurring, and the effectiveness of current controls in preventing that risk.) Generally, at the end of mapping, companies’ personnel should browse screens in the tool to ensure that all the required information has been transferred. The next step in the case study was encoding each business’s mapped data (now in the tool’s database) to SADML documents. This process went without error. In Code Snippet 4, an example of the security risk under examination (Risk101) is presented. The marked-up risk data has the same basis across businesses and documents due to the use of the shared risks base in the beginning. SADML provides the common structure, elements and attribute names. 
Different companies may, however, add varying comments or descriptions. The specific code in Snippet 4 is from Buyer.

Eavesdropping and tampering with data in a Web services' message (in transit) Malicious party Circulating information in inappropriately secured formats property:data web service message The data carried in the message is the key aspect Violation of confidentiality using eavesdropping ...

Code Snippet 4. SADML representation of the highlighted risk

18

A Case Study Analysis of an E-Business Security Negotiations Support Tool

217

The real difference in SADML documents across Buyer and Supplier is visible when it comes to the treatment of Risk101. In this case, Buyer aims to mitigate this risk while Supplier accepts it. SADML Code Snippet 5 shows this and the respective treatment factors. On the left hand side is Buyer’s document and on the right, Supplier’s. The + sign indicates that there is additional data which is not displayed here.

<mitigationAction> Protect against eavesdropping on Web service messages being transmitted between partners <details>The organization must take measures to ensure there is no eavesdropping on data being transmitted between Web services across business parties. + + + <securityPolicyRefs> <securityBudgetRefs /> + <securityRequirementRefs>

The unlikeliness of this risk and a limited security budget are the reasons for risk acceptance <details>Threat_Analysis09.doc + <securityPolicyRefs> + <securityBudgetRefs> +

Code Snippet 5. SADML representations of companies’ risk treatment choices

When compared to the original output from EBIOS and CORAS, one can appreciate the use of the standard format supplied by SADML. In this respect, SADML provides a bridge between different RM/RA methods and their software systems, which can then be used as a platform to compare high-level security actions across enterprises. It is worth noting that the benefits possible with SADML are largely due to its foundation in the well-researched ontology from the Solution model [7, 11].

With all the preparatory stages in the case process completed, Fig. 18.3 displays the output of the model’s final stage, i.e. the tool’s Comparison System, which is presented to personnel at Buyer and Supplier. Apart from the user-friendly, colour-coded report, the real benefit associated with this output is the automation of several of the preceding steps taken to reach this point. These included: (i) gathering data from RM/RA approaches (such as EBIOS and CORAS), albeit in a semi-automated fashion; (ii) allowing for influential factors in risk treatment that are key to forthcoming negotiations, to be defined in initial stages; and finally, (iii) matching and comparing the security actions and requirements of companies based on shared risks faced.

The output in Fig. 18.3 also aids in reconciling semantic differences across RM/RA approaches as these issues are resolved by mapping rules earlier in the process (as covered in [8]). Furthermore, personnel from companies can refer to the


Fig. 18.3 Area of focus in process flow of implemented solution model

ontology and the inclusive shared definitions/terminology at any point. This would be done to attain a clear understanding of terms in the context of the interactions. As parties come together therefore, they can immediately identify any conflicts in treatment choices and have the main factors supporting those conflicting choices displayed. This and the discussion above give evidence to show that in many ways, our tool has brought the interacting enterprises closer together, particularly in bridging a number of key gaps across companies. This therefore allows for an easier transition between the Requirement Elicitation and Negotiation phases in BOF4WSS.

The shortcomings of the tool identified in this section’s case study centered on the manual effort needed at a few stages to complete data mapping. This acted to limit some of the Solution model’s automated negotiations support goals. To critically consider this point however, the level of automation and support that is present now would significantly bridge the disparity gaps and support a much


easier negotiation on security actions between parties. A small degree of manual intervention therefore, even though not preferred, might be negligible. This is especially so in business scenarios where there are large numbers of risks or security actions to be deliberated, and thus saving time at any point would result in substantial boosts in productivity.

18.4 Conclusion and Future Work

The main goal of this paper was to extend initial evaluation work in [8] and pull together the compatibility evaluation of the tool and, more generally, the Solution model it embodies through the use of a full case study analysis. The findings from this new and more complete analysis are seen to supply further evidence to support the tool as a useful, feasible and practical system to aid in cross-enterprise security negotiations. This is especially so in terms of BOF4WSS but there might also be opportunities for its use in other collaborative e-business development methodologies. The main benefits of the tool and model are to be found in a much easier negotiation process which then results in significantly increased productivity for companies.

There are two prime avenues for future work. The first avenue consists of testing the tool with other RM/RA techniques; IT-Grundschutz Manual [14] and NIST Risk Management Guide for Information Technology Systems SP800-30 [15] are some of the methods under investigation for this task. Positive evaluation results would further support the tool and any justified nuances of those popular techniques would aid in its refinement. The second avenue is more generic and looks towards the research and development of additional approaches and systems to support BOF4WSS. Considering the comprehensive and detailed nature of the framework, support tools could be invaluable in promoting BOF4WSS’s use and seamless application to scenarios.

References

1. PricewaterhouseCoopers LLP. Information Security Breaches Survey 2010 [Online]. Available: http://www.pwc.co.uk/eng/publications/isbs_survey_2010.html
2. Tiller JS (2005) The ethical hack: a framework for business value penetration testing. Auerbach Publications, Boca Raton
3. Nurse JRC, Sinclair JE (2009) BOF4WSS: a business-oriented framework for enhancing web services security for e-Business. In: 4th International Conference on Internet and Web Applications and Services. IEEE Computer Society, pp 286–291
4. Nurse JRC, Sinclair JE (2009) Securing e-Businesses that use Web Services: A Guided Tour through BOF4WSS. Int J Adv Internet Technol 2(4):253–276
5. Steel C, Nagappan R, Lai R (2005) Core security patterns: best practices and strategies for J2EE, web services and identity management. Prentice Hall PTR, Upper Saddle River
6. Gutierrez C, Fernandez-Medina E, Piattini M (2006) PWSSec: process for web services security. In: IEEE International Conference on Web Services, pp 213–222
7. Nurse JRC, Sinclair JE (2010) A solution model and tool for supporting the negotiation of security decisions in e-business collaborations. In: 5th International Conference on Internet and Web Applications and Services. IEEE Computer Society, pp 13–18
8. Nurse JRC, Sinclair JE (2010) Evaluating the compatibility of a tool to support e-businesses’ security negotiations. In: Lecture notes in engineering and computer science: Proceedings of the World Congress on Engineering 2010, WCE 2010, London, UK, pp 438–443
9. Yau SS, Chen Z (2006) A framework for specifying and managing security requirements in collaborative systems. In: Yang LT, Jin H, Ma J, Ungerer T (eds) Autonomic and Trusted Computing, ser. Lecture Notes in Computer Science, vol 4158. Springer, Heidelberg, pp 500–510
10. Todd M, Zibert E, Midwinter T (2006) Security risk management in the BT HP alliance. BT Technol J 24(4):47–52
11. Nurse JRC, Sinclair JE (2009) Supporting the comparison of business-level security requirements within cross-enterprise service development. In: Abramowicz W (ed) Business Information Systems, ser. Lecture Notes in Business Information Processing, vol 21. Springer, Heidelberg, pp 61–72
12. DCSSI (2004) Expression des besoins et identification des objectifs de sécurité (EBIOS), Section 1–5. Secrétariat Général de la Défense Nationale, Direction Centrale de la Sécurité des Systèmes d’Information, Technical Report
13. den Braber F, Braendeland G, Dahl HEI, Engan I, Hogganvik I, Lund MS, Solhaug B, Stolen K, Vraalsen F (2006) The CORAS model-based method for security risk analysis. SINTEF, Technical Report
14. Federal Office for Information Security (BSI). IT-Grundschutz Manual [Online]. Available: https://www.bsi.bund.de/EN/Topics/ITGrundschutz/itgrundschutz_node.html
15. National Institute of Standards and Technology (NIST) (2002) Risk management guide for information technology systems (Special Publication 800-30), Technical Report

Chapter 19

Smart Card Web Server

Lazaros Kyrillidis, Keith Mayes and Konstantinos Markantonakis

Abstract In this article (based on “Kyrillidis L, Mayes K, Markantonakis K (2010) Web server on a SIM card. Lecture notes in engineering and computer science: Proceedings of the World Congress on Engineering 2010, WCE 2010, 30 June–2 July 2010, London, UK, pp 253–259”) we discuss the integration of a web server on a SIM card and attempt an analysis from various perspectives (management, operation, security). A brief presentation of the Smart Card Web Server (SCWS) is given, along with a use case that will help the reader to identify the way that an SCWS can be used in practice, before we reach a final conclusion.

L. Kyrillidis (corresponding author)
Information Security Strategy Consultant, Agias Lavras 3, Neapoli, Thessaloniki, Greece

K. Mayes, K. Markantonakis
Smart Card Centre, Royal Holloway, University of London, London, UK

19.1 Introduction

The World Wide Web (WWW) was a major step forward for humanity in terms of communication, information and entertainment. Originally, the web pages were static, not being changed very often and without any user interaction. This lack of interactivity led to the creation of server side scripting languages (like PHP) that allowed the creation of dynamic pages. These pages are often updated according to the users’ interests and in recent years even their content is created by the users (blogs, social networking, etc.). In order for these pages to be properly created

S. I. Ao and L. Gelman (eds.), Electrical Engineering and Applied Computing, Lecture Notes in Electrical Engineering, 90, DOI: 10.1007/978-94-007-1192-1_19, Springer Science+Business Media B.V. 2011


L. Kyrillidis et al.

and served, a special type of computer program is required. This program, known as a Web Server, accepts the users’ requests, processes them and returns the result to the requesting user’s browser.

Another important step for modern communications was the invention of mobile phones. The first devices suffered from fraud, because it was quite easy to intercept communications and clone phones. This inevitably led to the introduction of a secure, tamper-resistant module that could be used for securing the storage of sensitive information and cryptographic keys/algorithms that were used to secure communication between the phone and the network provider’s facilities. This module is referred to as the Subscriber Identity Module (SIM).

The idea of hosting a web server on a SIM was proposed almost a decade ago [1] and although it is not yet commercially available, recent technological advances suggest that the idea could be reconsidered. While the integration of SIM and web server could offer fascinating new prospects to both the network providers and the users, there is a security concern that it might also help attackers to gain access to the SIM contents. A practical concern is the extent of the added management, operation and personalization costs that this integration would entail.

19.2 Web Server on a SIM Card

The Open Mobile Alliance (OMA) is an international body that was formed to produce and promote standards for the mobile communications industry in order to encourage the interoperability of products, aiming at lower operational costs and higher quality products for the end users [2]. One of the standards that OMA created was the web server on the SIM card standard (Smart Card Web Server, SCWS) [3–5]. This standard defines a number of entities that the web server must contain:

• SCWS: The web server itself. It is located inside the SIM card.
• SCWS gateway: This entity is needed when the SIM card cannot directly respond to Hypertext Transfer Protocol (HTTP) requests, so the gateway’s main purpose is to translate the browser’s requests from HTTP to the local transport protocol and vice versa. A common local protocol would be the Bearer Independent Protocol (BIP) [6]. Additionally, the gateway is proposed to host a form of Access Control List (ACL) to control the access to the SCWS. It is located on the phone.
• HTTP(s) Client: The browser that will initiate requests towards the SCWS and will present the response to the end user. It is located on the phone.
• SCWS Administration Application: This entity is used for SCWS software updates/patches that would be applied remotely to the SCWS and for installing and/or updating possible web applications that may run on the SIM card. Additionally, it may be used to send new content in the form of HTML pages to the SCWS. It is located in the network provider’s premises.


19.2.1 Communication Protocols

The SCWS will use two different protocols to communicate with entities outside of the SIM card. The first will be the BIP protocol to encapsulate HTTP(s) packets when the SIM card does not implement its own TCP/IP stack, while the second one will be HTTP(s) for the newer smart cards that will allow direct HTTP access. We will now take a more detailed look at these two protocols and how they can be used with the SCWS (according to [3–5]):

19.2.1.1 BIP Protocol

As mentioned earlier, the BIP protocol will allow incoming and outgoing communication when the SIM card cannot support direct HTTP access. The SIM card will work in two modes:

• Client mode: The SCWS communicates with the remote administration entity to receive updates. The gateway translates requests from BIP to HTTP(s) (the SIM card can “speak” BIP, while the remote administration will “speak” HTTP(s)).
• Server mode: The SCWS communicates with the browser. The gateway is once again present, executing the translation between the BIP protocol that the SIM card understands and the HTTP(s) requests/answers that the browser understands.

19.2.1.2 TCP/IP Protocol

If the SIM card implements its own TCP/IP stack (from Java Card 3.0 onwards), there will be no need for the gateway and the communication will take place directly between the two entities (either the SCWS and the remote server or the SCWS and the browser).

19.2.2 Administration Protocols

When the SCWS is in client mode, it exchanges messages with the remote server. There are two ways for this message exchange to take place:

• When the amount of data that must be exchanged is relatively small, the Lightweight Administration Protocol should be used. The bearer of the commands that encapsulate the data is one or more SMS messages. When the SCWS receives them, it parses them and then sends a response back to the remote server, so that the latter can determine whether it can send the next command or simply terminate the connection.
• A second way that can be used for administration purposes is the Full Administration Protocol. A card administration agent that is located inside the SIM card is responsible for encapsulating and transferring the HTTP messages over PSK-TLS,


establishing the connection and, if necessary, reconnecting if the connection is dropped. The agent sends a message to the remote server and the latter responds with the administration command encapsulated within an HTTP response. The agent receives the command and passes it to the SCWS, which executes it. When the command has been executed, the agent contacts the remote server for the next command. This operation continues until the remote server terminates the connection.
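The request/execute/report cycle of the Full Administration Protocol can be sketched as a simple loop. This is only an illustrative sketch: the names `FakeRemoteServer`, `run_admin_session` and `scws_execute` are hypothetical, the remote server is simulated in memory, and the PSK-TLS transport and HTTP framing described in the text are deliberately left out.

```python
from typing import Callable, List, Optional


class FakeRemoteServer:
    """In-memory stand-in for the remote administration server (hypothetical)."""

    def __init__(self, commands: List[str]) -> None:
        self._pending = list(commands)

    def next_command(self, last_status: Optional[str]) -> Optional[str]:
        # Returning None models the server terminating the connection.
        if not self._pending:
            return None
        return self._pending.pop(0)


def run_admin_session(server: FakeRemoteServer,
                      execute: Callable[[str], str]) -> List[str]:
    """Agent loop: fetch a command, let the SCWS execute it, report, repeat."""
    statuses: List[str] = []
    status: Optional[str] = None  # the first request carries no previous result
    while True:
        command = server.next_command(status)
        if command is None:  # server has no further commands
            break
        status = execute(command)
        statuses.append(status)
    return statuses


def scws_execute(command: str) -> str:
    # Placeholder for the SCWS applying one administration command.
    return f"done:{command}"


if __name__ == "__main__":
    server = FakeRemoteServer(["PUT /index.html", "DELETE /old.html"])
    print(run_admin_session(server, scws_execute))
```

The agent never decides on its own when to stop; as in the text, the session ends only when the server stops supplying commands.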

19.2.3 SCWS URL, IP, Port Numbers

As mentioned earlier, there will be two ways to communicate with the SCWS: either over HTTP or BIP. The port numbers that will be used when the HTTP requests are encapsulated inside BIP packets are 3516 for HTTP and 4116 for HTTPs. The format of the URL in both cases will be:

http://127.0.0.1:3516/file1/file2/test.html
https://127.0.0.1:4116/sec_file1/sec_test.html

If the access to the SCWS is provided directly over HTTP(s), the port numbers are the ones used for traditional web servers (80 for HTTP and 443 for HTTPs). The SIM card will now have its own IP address, so the loopback will no longer be needed. The format of the URL will be:

http://<smart_card_IP>[:80]/file1/file2/test.html
https://<smart_card_IP>[:443]/sec_file1/sec_test.html
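The addressing rules above can be captured in a small helper. This is a minimal sketch: the function name `scws_url` and its parameters are hypothetical, and only the port numbers and the loopback address are taken from the text.

```python
from typing import Optional

# Port numbers quoted in the text above.
BIP_PORTS = {"http": 3516, "https": 4116}
DIRECT_PORTS = {"http": 80, "https": 443}


def scws_url(path: str, secure: bool = False,
             card_ip: Optional[str] = None) -> str:
    """Build an SCWS URL; card_ip=None means access through the BIP gateway."""
    scheme = "https" if secure else "http"
    if card_ip is None:
        # HTTP encapsulated in BIP: the browser addresses the loopback interface.
        host, port = "127.0.0.1", BIP_PORTS[scheme]
    else:
        # Direct HTTP(s): the SIM card has its own IP address.
        host, port = card_ip, DIRECT_PORTS[scheme]
    return f"{scheme}://{host}:{port}/{path.lstrip('/')}"
```

For example, `scws_url("file1/file2/test.html")` yields the first loopback URL shown above, while passing a `card_ip` (any address used here would be an assumption) switches to ports 80 and 443.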

19.3 Using the SCWS for E-Voting

A possible use for the SCWS is given in the following example: A country X provides all its citizens with an ID card that stores (in addition to the citizen’s name and ID card number) two certificates (one for encryption/decryption, one for digital signatures), the corresponding private keys and the government’s public keys (for a similar example, see [7]). These certificates are also installed in a central location that is administered by the government. The citizen can use his ID card for every transaction, either with the state or with other citizens. Additionally, the country has arranged with the mobile network providers to install these certificates/keys on the citizen’s mobile phone, so that the latter can use it to vote. The voting process is described in the following use case:

19.3.1 Process Flow

Let us suppose the following: CertA1 is the user’s certificate, PA1 the private key and PUA1 the public key used for encryption/decryption and CertA2 is the user’s


certificate, PA2 the private key and PUA2 the public key that are used for digital signatures. Likewise, CertB1 is the government’s certificate, PB1 the private key and PUB1 the public key used for encryption/decryption and CertB2 is the government’s certificate, PB2 the private key and PUB2 the public key that are used for digital signatures. Also let H be the hash algorithm that the two parties will use. The e-voting process is as follows:

• The network provider has updated the user’s SCWS slightly by presenting a link on the user’s home page named “e-voting”.
• The user clicks on the link.
• The user’s name and ID card number are encrypted with PUB1; additionally, both of them are hashed to produce the hash $H_A$, which is signed with PA2. All these are sent to the government’s remote server that hosts the e-voting site:

$$(ID_A, Name_A)_{PUB1},\ (H(ID_A, Name_A))_{PA2} \longrightarrow \text{Voting Server}$$

• The remote server decrypts $(ID_A, Name_A)_{PUB1}$ using PB1, extracts PUA2 from CertA2 (which it already knows), verifies $(H(ID_A, Name_A))_{PA2}$ using PUA2 (and gets $H_A$), hashes $(ID_A, Name_A)$ using H (and gets $H_B$), and checks $H_A$ against $H_B$. If the two hashes match each other, the server authenticates the user and may proceed with the rest of the process. Then, it checks whether the citizen has already voted and, if not, it creates a temporary entry in a database to show that the user’s voting is in progress.
• After the user is authenticated to the server, he is presented with a link that points to the IP of the SCWS. The user clicks on the link and is transferred to the SCWS environment. At the same time the remote server hashes $(ID_A, Name_A)$, sends the hash signed with PB2 and also sends an encrypted link L, which has embedded authentication data that will be used later on by the user (to authenticate himself on the remote site instead of providing a username/password):

$$(H(ID_A, Name_A))_{PB2},\ (L)_{PUA1} \longleftarrow \text{Voting Server}$$

• The SCWS receives $(H(ID_A, Name_A))_{PB2}$ and verifies it using PUB2. Then it hashes the user’s ID and name that are stored on the SIM card with H and, if the two hashes match each other, the server is authenticated and the SCWS can now prompt the user to provide the PIN. Additionally, the SCWS decrypts L using PA1.
• The user provides the PIN, it is checked by the SCWS and, if it is correct, the SCWS displays the link L that points to the remote server.
• The user clicks on the link and can now browse the voting site and vote. When his voting is done, the permanent entry in the database is updated to show that the user has voted.
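The server-side check of the signed hash (recover H_A with PUA2, recompute H_B, compare) can be illustrated with a toy example. Everything below is an assumption made for illustration: the text does not name the hash or signature algorithm, so SHA-256 and unpadded textbook RSA are used here, the key is built from two small, publicly known Mersenne primes, and the function names are hypothetical. Such a key is not secure, and the encryption with PUB1 is omitted.

```python
import hashlib

# Toy RSA key built from two known Mersenne primes. Illustration only:
# this key size and the absence of padding are NOT secure.
P, Q = 2**127 - 1, 2**89 - 1
N = P * Q
E = 65537
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent (needs Python 3.8+)


def h(citizen_id: str, name: str) -> int:
    """H(ID_A, Name_A), reduced modulo N so that it fits the toy key."""
    digest = hashlib.sha256(f"{citizen_id}|{name}".encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % N


def sign(value: int) -> int:
    """Sign with the private key (the role played by PA2 in the text)."""
    return pow(value, D, N)


def server_authenticates(citizen_id: str, name: str, signature: int) -> bool:
    """Recover H_A with the public key (PUA2), recompute H_B, and compare."""
    h_a = pow(signature, E, N)
    h_b = h(citizen_id, name)
    return h_a == h_b


if __name__ == "__main__":
    sig = sign(h("ID123", "Alice"))
    print(server_authenticates("ID123", "Alice", sig))    # identity matches
    print(server_authenticates("ID123", "Mallory", sig))  # identity altered
```

The same comparison is performed in the opposite direction when the SCWS verifies the hash signed by the server with PB2.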

19.3.2 Comments on this Use Case

One can argue about the need to use an SCWS for e-voting. While this document cannot explore the legal and ethical issues that arise because of the


sensitive nature of the elections, there are some reasons that can justify its use. The first is that a large part of the population is already familiar with a browser through day-to-day internet access, and those without previous experience can learn to use one fairly easily. Another important reason is that the security needed for e-voting (and other similar uses like e-shopping) can be provided by using the SCWS. The SIM card is the most secure token in mass production at the moment and can easily store all the sensitive information needed (certificates, keys, personal information). After all, even if a phone or SIM card is lost or stolen, it will be quite difficult for someone to extract the necessary information and by the time that he manages to do so, the certificates/keys will, most probably, have been revoked. A third reason is the transparency of the process. As mentioned earlier, the user needs to know how to use a browser and nothing more. All the necessary message exchange takes place without user interaction except when entering the PIN, and this allows for more complex protocols and longer cryptographic keys to be used.

The most difficult part of the overall process is defining who will be responsible for storing all these certificates/keys on the SIM card. Are the mobile operator companies trusted to install this sensitive information on the SIM cards, and in case they are not, will they allow the government to use their facilities? What happens with lost/stolen phones (revoking of the certificates), or simply when a user changes phone and/or network provider? Additionally, it must be ensured that everything runs smoothly, so that the election result is not disputed and that the legitimate user can vote only once.
This is quite a challenge as although the Internet is used by more and more people for all kinds of different purposes, it is far from being characterized as a secure environment. The ability that it provides for shopping, communicating, etc. intrigues malicious people and offers them a whole new environment where they can launch their attacks against unwary users. A number of malicious programs are created every day, including Trojan horses, viruses, rootkits and other attack software that is used for data theft, communications corruption, information destruction, etc. Although the SIM card is designed as an attack/tamper-resistant platform, extending its ability to serve HTTP(s) requests will make the SIM card and mobile phone even more attractive attack targets.

19.3.3 Secure Communication Channel

There are two ways for the SCWS to communicate with entities outside of the SIM card environment. The first is the communication between the SCWS and a remote server in order for the former to receive updates, and the second is the communication with the phone’s browser when the user submits requests to the SCWS. Both communication mechanisms need to be protected adequately.


The communication between the SCWS and the remote server is of vital importance, because it provides the necessary updates to the SCWS from a central location. Symmetric cryptography can provide the necessary level of security through the use of a pre-shared symmetric key [8]. The key has to be strong (long) enough so that even if the communication is eavesdropped, an attacker cannot decrypt or alter it. In addition, this offers mutual authentication, because the key is only known to the two entities, thus every message encrypted by that key can only come from a trusted entity.

The security of communication between the SCWS and the browser is also very important and the necessary level of protection can be provided in a variety of ways. As with the traditional Internet, the user can either use the HTTP or the HTTPs protocol to communicate with the SCWS. If the browser on the mobile phone requests information that needs little or no security, the communication can pass over the HTTP protocol, while communication that is sensitive is protected via HTTPs, thus offering confidentiality, integrity and authentication. Another security measure that can offer a second level of security is the use of the PIN to authenticate the user to the SCWS. The final measure that can be utilized is some form of ACL that will allow applications meeting certain trust criteria to access and communicate with the SCWS, while blocking non-trusted applications.

19.3.4 Data Confidentiality/Integrity

The SCWS handles two kinds of data: data stored on the SIM card and data in transit. The first kind of data has an adequate level of protection as modern smart cards are designed to strongly resist unauthorized access to the card data. An attacker would need costly and advanced equipment, expert knowledge and a lot of time, as modern smart cards have many countermeasures to resist known attacks [9]. Data in transit cannot benefit from the protection that the card offers and is more exposed to attacks. If data in transit does not pass over secure channels with the use of the necessary protocols, this may lead to data that is altered, destroyed or eavesdropped. Measures must be taken so that these actions are detected and, if possible, prevented.

An attacker may alter data in two ways: by just destroying a message (or transforming it into a meaningless one) or by trying to produce a new version of the message with an altered meaning. The first attack simply wants to “break” the communication, while the second one aims to exploit a weakness, e.g. to execute malicious commands against the server. It is obvious that the latter is far more difficult, especially when the message is encrypted or hashed. In the SCWS context, the messages are mostly the commands exchanged between the SCWS and the remote server for remote administration or between the SCWS and the user’s browser. The alteration of the exchanged messages can be avoided by adding a


MAC at the end of the message when there is a pre-shared key (in the case of the remote administration) or by using digital signatures when a pre-shared key cannot be (securely) exchanged. Using any form of strong encryption can provide the necessary confidentiality needed for the data that is handled by the SCWS and symmetric or public-key cryptography can be used according to needs. OMA proposes the use of PSK-TLS for confidentiality/integrity between the SCWS and the remote server and public key cryptography for the communication between the SCWS and the various applications on the phone. In the second case the use of PSK-TLS is optional.
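Appending a MAC computed with a pre-shared key, and rejecting any message whose MAC does not verify, can be sketched with Python's standard `hmac` module. The choice of HMAC-SHA-256 and the function names are assumptions made for this sketch; the text only states that a MAC is added at the end of the message.

```python
import hashlib
import hmac
from typing import Optional

TAG_LEN = hashlib.sha256().digest_size  # 32 bytes for SHA-256


def add_mac(key: bytes, message: bytes) -> bytes:
    """Return the message with its HMAC-SHA-256 tag appended."""
    return message + hmac.new(key, message, hashlib.sha256).digest()


def verify_and_strip(key: bytes, tagged: bytes) -> Optional[bytes]:
    """Return the original message if the tag verifies, otherwise None."""
    if len(tagged) < TAG_LEN:
        return None
    message, tag = tagged[:-TAG_LEN], tagged[-TAG_LEN:]
    expected = hmac.new(key, message, hashlib.sha256).digest()
    # compare_digest avoids leaking information through timing differences.
    return message if hmac.compare_digest(tag, expected) else None
```

A MAC of this kind detects alteration but does not hide the message content; confidentiality still requires encryption, as the paragraph above notes.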

19.3.5 Authentication

For authentication purposes, OMA proposes the use of Basic Authentication and optionally the use of Digest Authentication [10]. While the former can be used when there is no or little need for authentication, in case that an application/entity needs to authenticate the SCWS and vice versa, the use of Digest Authentication is mandatory.
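The difference between the two HTTP schemes can be seen in how the credentials are computed. The sketch below follows RFC 2617 (Digest is shown in its simplest form, without the optional qop field); the function names are hypothetical, and how an SCWS would issue nonces is not covered.

```python
import base64
import hashlib


def basic_credentials(user: str, password: str) -> str:
    """Basic: the password is only Base64-encoded, so it is trivially recoverable."""
    return base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")


def _md5_hex(text: str) -> str:
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def digest_response(user: str, realm: str, password: str,
                    method: str, uri: str, nonce: str) -> str:
    """Digest (RFC 2617, no qop): only a hash bound to the server nonce is sent."""
    ha1 = _md5_hex(f"{user}:{realm}:{password}")
    ha2 = _md5_hex(f"{method}:{uri}")
    return _md5_hex(f"{ha1}:{nonce}:{ha2}")
```

With Basic Authentication anyone who observes the header can decode the password, whereas the Digest response changes with every nonce and never exposes the password itself, which is consistent with Digest being required when real authentication is needed.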

19.4 Management Issues

Managing a web server is a complicated task, because of all the different possibilities that exist for setting it up and tuning it. On top of that, the administrator must pay attention to its security and implement necessary countermeasures so that the server is not an easy target for possible attacks. Additionally, he must take care to set up correctly the server-side scripting language(s) that the web server will use, in order to avoid setup mistakes that affect a large number of servers, e.g. PHP’s register_globals problem [11]. Therefore, to set up a web server correctly, an administrator must be aware of all the latest vulnerabilities, which could be quite a challenging task, especially as correcting a wrong setup option may sometimes lead to corrupted programs/websites that do not work [12].

The SCWS is not affected by these issues. Most probably it will be set up from a central location (the network operator’s facilities), and must be carefully managed, because otherwise it may introduce vulnerabilities to attacks against it and against the SIM card itself. Most probably the SCWS will have a common setup for all its instances. An important problem is what to do with older SIM cards that may not be able to host a program like a web server (even a “lightweight” web server like the SCWS). This may lead to a large number of users being unable to have the SCWS installed. Finally, any patches/updates needed for the SCWS will be installed from a central location, meaning that this adds another burden to the network provider.


19.5 Personalization

At the beginning, the Internet was a static environment with content that was presented to the user “as is”, without interaction. However, since the arrival of Web 2.0 this has changed radically; now it is often the user who “creates” and personalizes the content [13]. Social networking sites, blogs and other internet sites allow a user to be an author, to present his photographic skills, to communicate with people from all over the world, to create the content in general. This advance in Internet interactivity allowed a number of companies to approach the user offering services or products that were of interest according to Internet “habits”, e.g. a person that visits sports sites would be more interested in sports clothing than a person that visits music sites.

The idea of personalized content can be applied to the SCWS environment as well. Network companies may offer services that interest their clients based not only on their needs/interests but also on their circumstances, e.g. if a client is in a different country, the company may provide information about that place (museums, places of interest, hospitals, other useful local information) and send this information to the SCWS. The user can then easily, using his phone browser, access the SCWS content, even if the user is offline (not connected to the internet).

One issue is to define who will be the creator of the content. Will this be the network provider or will third parties be allowed to offer content as well? Giving access to a third party may be resisted for business reasons and also because of the potential for undermining the security of the platform. From a practical viewpoint, multiple applications are not too much of a challenge for the modern SIM card platform, as it is designed to permit third parties to install, manage and run applications.

19.6 Web Server Administration

The administration of a web server can be quite a challenging task, due to the fact that the server must run smoothly, work 24/7/365 and serve from a few hundred to even millions of requests (depending on the sites that it hosts). The SCWS will not be installed in a central location like a traditional web server, but rather it will be installed in a (large) number of phones. The web server can come pre-installed on the SIM card, and when an update/patch must take place this can happen in one of the following two ways: either centrally with mass distribution of newer versions/updates or by presenting a page to the user, for self download and install. Third party applications that may be installed and run as part of the web server can be administered in the same way.

L. Kyrillidis et al.

19.7 Web Server’s Processing Power and Communication Channel Speed

Depending on the need, a web server can be slow or fast. A server that has to support a hundred requests does not need the same bandwidth or processing power as one that serves a million requests. The supporting infrastructure is of huge importance, so that users’ requests are handled swiftly even when some servers fail. Additionally, if the performance of the communications bearer does not match the processing power of the web server itself (or vice versa), the user’s perception of the overall performance will be poor. The SCWS will not serve requests for more than one user (if we assume that the SCWS is only accessible from inside the phone), so one could argue that neither processing power nor the communication channel is of huge importance. However, even if the SCWS has to respond to only one request at a time, this may still be a demanding task while the processing power of the SIM card remains small, especially if the SCWS has to serve multimedia or other resource-consuming content. The same concern applies to the communication channel speed. A traditional web server may have a fast line (or more than one) along with backup lines to serve its requests. The SCWS cannot rely only on the traditional ISO 7816 interface [14]. This interface, which exists in most phones at the moment, is too slow to serve incoming and outgoing requests to and from the SCWS. This need is recognized by ETSI, and it is expected that in the near future the 7816 interface will be replaced with the (much) faster USB one [15]. The necessary speed can therefore be provided only when the USB interface is widely available.

19.8 IP Mobility

A web server that changes IP addresses may not be as accessible as it must be, because every time its IP address changes a number of DNS servers must be informed and their databases updated. This update may take from several hours to a couple of days. This is the reason that web servers have static IP addresses [16], so that every time a user enters a URL there is a known match between that URL and an IP address. At the beginning the SCWS will not be as accessible as a traditional web server. This is because (as mentioned before) it will serve requests initiated by one user only: it will be accessible only from inside the phone, and most probably its IP address will be 127.0.0.1 (the loopback address). However, if the SCWS becomes accessible to entities outside the phone, it can no longer answer only requests destined to the loopback address and will need a publicly accessible address. Although this can be solved by giving each SCWS a static IP address, what will happen when the user is travelling? After all, IP address ranges are assigned to cities or countries, so a PC has an IP address within one range when it is in London, UK and within another when it is in Thessaloniki, Greece.

Smart Card Web Server

A phone is a mobile device which is often carried between cities, countries or even continents, and so it needs a different IP address every now and then. This means that if the SCWS becomes publicly accessible and serves requests from entities outside the phone, there must be a way to permanently match the SCWS’s URL to a certain IP address. While a fixed address is not possible for a mobile device, the answer to this problem can be found in RFC 3344 and RFC 3775 (for mobile IPv4 and IPv6 respectively). Briefly, the two RFCs use a home address and a care-of address. Packets that are destined for the home address are forwarded to the care-of address (which is the phone’s current address). This binding requires co-operation between the network providers, but if it is set up correctly it can enable the phone to move without problems and permit the SCWS to serve requests smoothly [17, 18].
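The home address and care-of address binding described above can be illustrated with a toy model. This is only a sketch of the idea and not an implementation of RFC 3344 or RFC 3775: the class name `HomeAgent`, its methods and the addresses (taken from the documentation ranges) are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class HomeAgent:
    """Toy home agent: maps a fixed home address to the current care-of address."""
    bindings: dict = field(default_factory=dict)

    def register(self, home_addr: str, care_of_addr: str) -> None:
        # Called whenever the phone attaches to a new network.
        self.bindings[home_addr] = care_of_addr

    def forward(self, dst_addr: str, payload: bytes) -> tuple:
        # Packets for a registered home address are redirected to the care-of
        # address; anything else keeps its original destination.
        return (self.bindings.get(dst_addr, dst_addr), payload)


agent = HomeAgent()
SCWS_HOME = "192.0.2.10"  # hypothetical home address of the SCWS

agent.register(SCWS_HOME, "198.51.100.7")   # phone attached to a first network
first_hop = agent.forward(SCWS_HOME, b"GET / HTTP/1.1")

agent.register(SCWS_HOME, "203.0.113.42")   # phone moved to another network
second_hop = agent.forward(SCWS_HOME, b"GET / HTTP/1.1")
```

The URL of the SCWS keeps resolving to the same home address, while the binding is the only thing that changes when the phone moves.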

19.9 Conclusion

Predicting the future of the SCWS is not an easy task, but it is certainly an interesting one. Obstacles associated with the SIM card’s limited processing power and low-bandwidth communication channel may be overcome by advances in current technology: newer SIM cards have more memory and faster processing units, and the USB interface will provide the necessary communication speed. However, technical issues alone are unlikely to decide the future of the SCWS, as this will be primarily determined by the network providers based on profitability, potential security vulnerabilities and user acceptance.

References

1. Rees J, Honeyman P (1999) Webcard: a java card web server. Center for Information Technology Integration, University of Michigan
2. Open Mobile Alliance. http://www.openmobilealliance.org/
3. OMA, Enabler Release Definition for Smartcard-Web-Server Approved Version 1.0-21 April 2008, OMA-ERELD-Smartcard_Web_Server_V1_0-20080421-A
4. OMA, Smartcard Web Server Enabler Architecture Server Approved Version 1.0-21 April 2008, OMA-AD-Smartcard_Web_Server_V1_0-20080421-A
5. OMA, Smartcard-Web-Server Approved Version 1.0-21 April 2008, OMA-TS-Smartcard_Web_Server_V1_0-20080421-A
6. ETSI TS 102 223
7. AS Sertifitseerimiskeskus, The Estonian ID Card and Digital Signature Concept Principles and Solutions, Version 20030307
8. Menezes AJ, van Oorschot PC, Vanstone SA (1996) Handbook of applied cryptography. CRC Press, USA, pp 15–23, 352–359
9. Rankl W, Effing W (2003) Smart card handbook, 3rd edn. Wiley, New York, pp 521–563
10. Internet Engineering Task Force, HTTP Authentication: Basic and Digest Access Authentication. http://tools.ietf.org/html/rfc2617
11. PHP Manual, Using Register Globals. http://php.net/manual/en/security.globals.php

12. Esser S, $GLOBALS Overwrite and its Consequences. http://www.hardened-php.net/globals-problem (November 2005)
13. Anderson P (2007) What is Web 2.0? Ideas, Technologies and Implications for Education. Technology & Standards Watch, February 2007
14. Mayes K, Markantonakis K (2008) Smart cards, tokens, security and applications. Springer, Heidelberg, pp 62–63
15. ETSI SCP Rel.7
16. Hentzen W, DNS explained. Hentzenwerke Publishing, Inc., USA, pp 3–5
17. Internet Engineering Task Force, IP Mobility Support for IPv4. http://www.ietf.org/rfc/rfc3344.txt
18. Internet Engineering Task Force, Mobility Support in IPv6. http://www.ietf.org/rfc/rfc3775.txt

Chapter 20

A Scalable Hardware Environment for Embedded Systems Education

Tiago Gonçalves, A. Espírito-Santo, B. J. F. Ribeiro and P. D. Gaspar

Abstract This chapter presents a scalable platform designed from scratch to support teaching laboratories on embedded systems. The platform’s complexity can grow, offering more functionality, as the student progresses. An I2C bus guarantees the continuity of functionality among modules. This functionality is supported by a communication protocol that is also presented in this chapter.

20.1 Introduction

Embedded systems design plays a strategic role from an economic point of view, and industry requires adequately trained engineers to perform this task [1, 2]. Universities all over the world are adapting their Electrical Engineering and Computer Science curricula to meet this demand [3–7]. An embedded system is a specialized system in which the computer is enclosed inside the device that it controls. Both low- and high-technology products are built following this concept.

T. Gonçalves (✉), A. Espírito-Santo, B. J. F. Ribeiro, P. D. Gaspar
Electromechanical Engineering Department, University of Beira Interior, Covilhã, Portugal
e-mail: [email protected]
A. Espírito-Santo e-mail: [email protected]
B. J. F. Ribeiro e-mail: [email protected]
P. D. Gaspar e-mail: [email protected]

S. I. Ao and L. Gelman (eds.), Electrical Engineering and Applied Computing, Lecture Notes in Electrical Engineering, 90, DOI: 10.1007/978-94-007-1192-1_20, Springer Science+Business Media B.V. 2011

T. Gonçalves et al.

The complexity of an embedded system changes from one product to another, depending on the task that it must perform. Therefore, embedded system designers must have knowledge from different areas. The development of hardware requires knowledge of digital and/or analog electronics and, at the same time, of electromagnetic compatibility issues, which cannot be neglected in high-frequency operation or in products that must work in very restrictive environments, such as those found in hospitals. In addition, the designer must design the firmware required by the hardware so that it works as expected. Subjects like operating systems, real-time systems, fixed- and floating-point arithmetic, digital signal processing, and programming languages such as assembly, C/C++ or Java are of major relevance to the development of embedded systems. Curricula for teaching embedded systems are not well established, unlike the classic knowledge areas where it is possible to find textbooks to support students’ study. Different approaches are used to structure curricula in embedded systems. The ARTIST team has established the competencies required by an embedded software engineer. Their proposal highlights practical work as an essential component of education in embedded systems [8]. As stated previously, the skills that an embedded systems designer must hold are highly complex and spread across different areas [9, 10]. If, beyond this knowledge, the student also needs to learn how to work with a complex development kit, then he/she will probably fail. Even if the student is adequately prepared in the essential subjects, the time taken to obtain visible results is sometimes responsible for student demotivation and consequent failure.

20.2 Hardware Platform Design Overview

The development of an embedded system relies on a large and assorted set of technologies in constant and rapid evolution. The learning platform described here aims to improve student access to this set of technologies in an educational environment [11]. For this reason, the use of commercial evaluation kits is discouraged, since they are mainly developed to showcase the potential of a specific product, without educational concerns. The choice of the MSP430 family is justified by the number of configurations available, with a high count of peripherals and different memory sizes, and by its popularity. Another relevant attribute is the learning curve of these devices which, in the authors’ experience, allows results to be obtained rapidly [12]. The developed platform currently has four modules (see Fig. 20.1), with increasing complexity, but others can be developed in the future. The learning platform can thus satisfy the needs of beginners and experienced students alike and, at the same time, has a high potential for evolution. The learning platform presented here allows users to experiment with different kinds of interfaces, such as an OLED display, seven-segment displays and conventional LEDs. Students can interact with several other peripherals, for example pressure switches, a touch-screen pad, a joystick, and an

Fig. 20.1 Example of one allowed configuration—Module 0 and Module 1 connected

accelerometer. An overview of the structure of the developed modules can be observed in Fig. 20.2. The modules were built around three devices from the MSP430 family: MSP430F2112, MSP430FG4618 and MSP430F5419. The strategy adopted to design the teaching platform took into consideration the following characteristics:

• Modules can exchange data through an I2C bus.
• Each module has two microcontrollers: one manages the communications, while the other is available for the student’s work.
• An SPI bus connects both processors in the same module.
• Energy consumed in each module can be measured and displayed in real time.
• All modules have standard dimensions.

Fig. 20.2 Structure overview of the teaching environment for embedded system (block diagram: a master and two slaves, each with a module function connected through SPI to Tx and Rx buffers and an I2C interface on a shared I2C bus; the master also holds the router buffers)

Fig. 20.3 Communication infrastructure (block diagram: MSP430F2112 with JTAG for program and debug, reset, power, SPI, UART, I/O, a BCD encoder for the I2C address, I2C to the Expansion Bus, and the ADC fed through second-order filters by current sensors S1 and S2 and auxiliary inputs Aux1 and Aux2)

The communication management infrastructure shown in Fig. 20.3 implements the protocol described in Sect. 20.4. All the functionality specified for Module 0 (Basic Interface and Power) and for Module 1 (Basic Interface) is implemented with the MSP430F2112 microcontroller. This device can operate with a maximum clock frequency of 16 MHz; it has 32 kB of flash memory, 256 B of RAM, one 10-bit ADC, two timers with their respective compare/capture units, and enough digital I/O to satisfy the needs. The device also supports the SPI, UART, LIN, IrDA, and I2C communication protocols. A BCD encoder sets the address of the module on the I2C bus. Four LEDs show the communication status. A pressure switch can reset the communication management hardware. An SPI channel connects the user’s microcontroller to the microcontroller that manages the communications, which in turn is connected to the I2C bus. The MSP430 family performs well in situations where energy consumption is a major concern. The energy consumed by each module can be monitored and displayed in real time. Current is measured at two distinct points. At measuring point S1 the current of the user’s microcontroller is acquired with a current shunt monitor (INA198). The total current required by the whole module is acquired at measuring point S2 with a precision resistor. Two other analogue inputs are also available; the student can use them to acquire the working voltage of the module. The energy consumed is then computed from the current and the voltage.
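The energy computation mentioned above can be sketched as follows. The example is only illustrative: the reference voltage, the monitor gain, the shunt resistance, the 1:2 voltage divider and the sample values are assumptions, not values given in the chapter.

```python
def adc_to_volts(code, vref=2.5, bits=10):
    """Convert a raw ADC code to volts (vref is an assumed reference)."""
    return code * vref / ((1 << bits) - 1)


def shunt_current(v_out, gain=100.0, r_shunt=0.1):
    """Current through the shunt, from the monitor output voltage.

    gain and r_shunt are illustrative assumptions.
    """
    return v_out / (gain * r_shunt)


def energy_joules(samples, dt):
    """Accumulate E = sum(V * I * dt) over (volts, amps) samples."""
    return sum(v * i * dt for v, i in samples)


# Three hypothetical samples taken every 10 ms:
# (ADC code of the supply voltage behind a 1:2 divider,
#  ADC code of the current monitor output).
raw = [(675, 82), (675, 164), (675, 82)]
samples = [(2 * adc_to_volts(v), shunt_current(adc_to_volts(i))) for v, i in raw]
total = energy_joules(samples, dt=0.010)
```

On the real module the same accumulation would run in firmware, typically with fixed-point arithmetic instead of floating point.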

20.3 Platform Modules in Detail

20.3.1 Module 0: Basic Interface and Power

This module was developed to support students’ first steps in embedded systems. Usually, this kind of user does not have any experience with the development of embedded systems. This module helps them become familiar with the

Fig. 20.4 Module 0: structure of the basic interface and power module (block diagram: battery, external power and JTAG inputs to a DC-DC voltage regulator; MSP430F2112 with JTAG for program and debug, two BCD encoders on the SEL lines, a BCD decoder driven by the DATA lines, the ENDIS digit-enable lines, current sensors S1 and S2, and the communication management block connected through SPI and UART, with I2C to the Expansion Bus)

microcontroller architecture and, at the same time, with the software development tool. The internal structure of the module is illustrated in Fig. 20.4. Beyond providing power to itself, the module can also power the modules connected to it through the Expansion Bus. Three different options are available to power the system: a battery, an external power source, or the JTAG programmer. The DC–DC converter accepts an input range from 1.8 to 5.5 V and provides a regulated 3.3 V output. Powering from the JTAG is only available to support programming and debugging activities. A numerical display was built with four independent seven-segment digits. Four data lines (DATA) control the writing operation. A BCD decoder allows the desired value to be written to the display. The digit to be written at a specific moment is selected by four control lines (ENDIS). Because the BCD decoder does not latch its output, the microcontroller must continually refresh the value shown in the display, with a minimum frequency of 15 Hz. Despite the simplicity of this module, its predefined task is the visualization of the current and the energy consumed by each one of the modules connected to the Expansion Bus. Two BCD encoders, with ten positions each, are connected to the microcontroller through eight selection lines (SEL) and are used to select the module from which the information will come. To execute this feature, the user’s microcontroller must be loaded with specific firmware.
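The refresh scheme described above can be modelled in a few lines. This is a sketch under stated assumptions: the bit ordering of the ENDIS and SEL lines is hypothetical, and the 15 Hz figure is interpreted here as the rate at which the whole four-digit frame is redrawn.

```python
def to_bcd_digits(value, n_digits=4):
    """Split a non-negative integer into n_digits decimal digits, MSD first."""
    if not 0 <= value < 10 ** n_digits:
        raise ValueError("value does not fit on the display")
    return [(value // 10 ** k) % 10 for k in reversed(range(n_digits))]


def refresh_frame(value):
    """Return one full refresh as (DATA nibble, ENDIS one-hot mask) pairs.

    Bit 3 of the ENDIS mask is assumed to enable the leftmost digit.
    """
    digits = to_bcd_digits(value)
    return [(d, 1 << (3 - pos)) for pos, d in enumerate(digits)]


def digit_slot_seconds(frame_rate_hz=15.0, n_digits=4):
    """Longest time one digit may stay selected while keeping the frame rate."""
    return 1.0 / (frame_rate_hz * n_digits)


def selected_module(sel_lines):
    """Decode the eight SEL lines as two BCD nibbles (tens, units)."""
    tens, units = (sel_lines >> 4) & 0xF, sel_lines & 0xF
    return 10 * tens + units
```

In firmware, a timer interrupt with a period no longer than `digit_slot_seconds()` would output the next pair of the frame on each tick.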

20.3.2 Module 1: Basic Interface

This module is directed to students who already have some basic knowledge of the embedded systems field. Like the previous module, it is based on the MSP430F2112 microcontroller. As can be seen in Fig. 20.5, eight switches, eight LEDs, and a seven-segment display with two digits are connected to this device.

Fig. 20.5 Module 1: structure of the basic interface module (block diagram: MSP430F2112 with JTAG for program and debug, power, two BCD decoders driving the two seven-segment digits from the DATA lines, a buffer/driver for the eight LEDs, switches Sw1 to Sw8, current sensors S1 and S2, and the communication management block connected through SPI and UART, with I2C to the Expansion Bus)

This module is intended to develop the student’s competences related to synchronous and asynchronous interrupts. At the same time, the student can explore programming techniques, for example those based on interrupts or on port polling, to check the status of the digital inputs or to set the status of the digital outputs. With this module, the student can have a first contact with the connection of the microprocessor to other devices, which compels him or her to respect access times. Two BCD decoders, with the ability to latch their outputs, are used to write to the display. While the DATA lines are used to write the value to the display, the LE lines are used to select which digit will be written. The DATA lines are also used, with a buffer/driver, to turn each one of the eight LEDs on or off. The user can configure the inactive state of the switches.

20.3.3 Module 2: Analog and Digital Interface

Students with more advanced knowledge of the embedded systems field can use this module to improve their ability to develop applications where the human–machine interface, analogue signal conversion, and digital processing are key aspects. This module was designed around the MSP430FG4618 microcontroller. This device can operate at a maximum clock frequency of 8 MHz and has 116 kB of flash memory and 8 kB of RAM. The high count of digital I/O is shared with the on-chip LCD controller. Other peripherals that are normally used for analog/digital processing are also present: a 12-bit ADC, two DACs, three operational amplifiers, DMA support, two timers with compare/capture units, a high number of digital I/O, and a hardware multiplier. This device also supports the SPI, UART, LIN, IrDA, and I2C communication protocols. The on-chip LCD controller allows the connection of LCDs with up to 160 segments. The board carries a navigation joystick with four positions and a switch, a rotational encoder with 24 pulses/turn and a switch, a speaker output, a microphone input, generic I/O, and an

Fig. 20.6 Module 2: structure of the analog and digital interface module (block diagram: MSP430FG4618 with JTAG for program and debug, power, LCD controller with SEG and COM lines, operational amplifiers with microphone input and speaker output, digital I/O port, joystick on the NAV lines, rotational encoder on the ROT lines, switches sw1 and sw2, current sensors S1 and S2, and the communication management block connected through SPI and UART, with I2C to the Expansion Bus)

alphanumeric LCD. The internal structure of Module 2 (Analog and Digital Interface) is illustrated in Fig. 20.6. Students can explore the operation of LCD devices, taking advantage of the on-chip LCD controller. The module can also be used to improve knowledge related to the development of human–machine interfaces. An example of a laboratory experiment that students can perform with this module is to acquire an analogue signal, condition it with the on-chip op-amps, and digitally process the conversion result with a software application. The result can then be converted back to the analog world using the on-chip DAC. With the on-chip op-amps it is possible to verify the operation of different topologies, for example buffer, comparator, inverting and non-inverting amplifier, and differential amplifier with programmable gain. The digitized signal can be processed using the multiply-and-accumulate hardware peripheral. Students can thus draw conclusions about the relevance of this peripheral in the development of fast real-time applications.
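The laboratory experiment described above (acquire, multiply and accumulate, convert back) can be prototyped on a PC before moving to the microcontroller. The sketch below is a hypothetical example: the 4-tap moving-average filter, the scaling shift and the sample values are assumptions chosen for illustration.

```python
def fir_mac(samples, coeffs):
    """FIR filter written as an explicit multiply-accumulate loop."""
    out = []
    history = [0] * len(coeffs)
    for x in samples:
        history = [x] + history[:-1]   # shift the newest sample in
        acc = 0
        for h, c in zip(history, coeffs):
            acc += h * c               # one MAC operation per tap
        out.append(acc)
    return out


def to_dac_code(acc, shift, bits=12):
    """Scale the accumulator back and clamp it to the DAC range."""
    code = acc >> shift
    return max(0, min((1 << bits) - 1, code))


# 4-tap moving average with integer coefficients (sum = 4, so shift by 2).
adc_codes = [0, 4095, 4095, 4095, 4095, 0]
filtered = [to_dac_code(a, shift=2) for a in fir_mac(adc_codes, [1, 1, 1, 1])]
```

The inner loop is the part that a hardware multiplier with accumulate support speeds up, which is why the number of taps directly limits the achievable sample rate.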

20.3.4 Module 3: Communication Interface

This is the most advanced module. With this module, the student has access to a set of sophisticated devices that are normally incorporated in embedded systems. The student can explore how to work with an OLED display with 160 × 128 pixels and 262,000 colors, a three-axis accelerometer, an SD card, a touch-screen, a USB port, and two PS/2 interfaces. The module also has connectors to support the radio frequency modules Chipcon-RF and RF-EZ430. Module 3 (Communication Interface), illustrated in Fig. 20.7, was built around a microcontroller with high processing power. The MSP430F5419 can operate with a maximum clock frequency of 18 MHz, and it has 128 kB of flash memory and 16 kB of RAM. This microcontroller also has a hardware multiplier,

Fig. 20.7 Module 3: structure of the communication interface module (block diagram: MSP430F5419 with JTAG for program and debug, power, Timer B, USB controller with EPROM and SUSPUSB line, SD connector, OLED display with data lines and VOLEDEN line, two PS/2 connectors behind level shifters with KB_CLK, KB_DATA, M_CLK and M_DATA lines, touch-screen controller for a 4-wire touch-screen with PENIRQ and VTSCEN lines, 3-axis accelerometer with INTACC and VACCEN lines, Chipcon RF and RF-EZ430 connectors, current sensors S1 and S2, and the communication management block, with I2C to the Expansion Bus)

a real-time clock, direct memory access, and a 12-bit ADC. The communication protocols SPI, UART, LIN, IrDA, and I2C can be implemented in four independent peripherals. The device has three independent timers with compare/capture units. The MMA7455L accelerometer, with adjustable sensitivity, uses an I2C interface to connect to the microcontroller on port USCIB1. It has two outputs that can signal different conditions, such as data available, free fall, or motion detection. This device can be enabled by the microcontroller through the VACCEN line. The four-wire resistive touch-screen, with a 6 × 8 cm area, needs permanent management of its outputs. To free the microcontroller from this task, a touch-screen controller connects it to the microcontroller through the I2C bus (USCIB1). The line PENIRQ notifies the microcontroller that the

touch-screen is requesting attention. To save power, the microcontroller can enable or disable the touch-screen controller through the VTSCEN line. The two PS/2 ports are connected to the microcontroller through a bidirectional level shifter that adapts the working voltage levels from 5 to 3.3 V. The clock signal is provided by Timer B. The USB communication port uses a dedicated controller that interfaces with the UART (USCIA2). The USB controller firmware is stored in an EPROM that can communicate with the USB controller or with the microcontroller over an I2C bus (USCIB3). The USB can be suspended or reset by the microcontroller through the SUSPUSB line. The SD socket is only a physical support that allows the card to be connected to the microcontroller through an SPI (USCIA1). Two additional lines allow card detection and inhibition of writing operations. The OLED display supports two different interface methods: an 8-bit parallel interface, or SPI (USCIB0). The parallel interface requires a specific software driver. A simpler interface can be implemented with the SPI bus. The radio frequency interfaces allow the connection of an RF module from Chipcon, connected to the microcontroller by an SPI interface (USCIA3). The RF-EZ430 radio frequency module can be connected through a UART (USCIA0).

20.4 Communications Protocol

Data exchange between modules is based upon two communication methods: (1) the communication method between the module application function and the network function; and (2) the communication method supported by the bus, which has the main task of interconnecting all modules, working as a router among them. This characteristic gives the system high versatility, allowing the set of supported applications to grow. The adopted solution was a serial bus; the key criteria for this choice were the physical dimensions, the maximum length of the bus, the transmission data rate and, of course, the availability of the serial communications interface. The I2C communication protocol is oriented to master/slave connection, i.e., the exchange of data always occurs through the master. This leads to the definition of two distinct functional units. Although this technology supports multi-master operation, it was decided to use a single master, giving the network a hierarchical structure with two levels. All the possible information transactions at the bus physical level between the master and the slave are represented in Fig. 20.8. To send a command, the master begins the communication by sending a start signal (S), followed by the slave address (Address X). The slave returns the result to the master after the reception and execution of the command. The procedure to send data to and receive data from the slave is also represented. The master starts the communication by sending the start signal (S), followed by the slave address. The kind of operation to perform (read or write) is sent next. The ACK signal sent by the I2C controllers is not represented in the figure.

Fig. 20.8 Typical master/slave data exchange at the I2C bus level (panels: master sends command to slave; master gets data from slave; master sends data to slave. Each transaction is drawn as a sequence of start condition, slave address X, R/W bit, command, data or command return bytes, and stop condition. Legend: S, bus start condition; P/S, bus start or stop condition; W, bus write operation; R, bus read operation)

Module 0 (Basic Interface and Power) performs the master role. This choice is justified by the fact that this module is always present in the network. If slave unit B wants to send a message to slave unit A, the message must first be sent to the master, which then forwards it to the destination unit. To satisfy the specifications, a communication frame was defined, as can be observed in Fig. 20.9. In order to implement the communication service layer, a media access protocol was outlined with the mechanisms required to exchange information between modules. The service frame and the communication protocols between the master unit and the slaves are defined as follows. The service frame has three different fields. The header field, two bytes long, carries the information about the message type. The routing information includes the sender and receiver addresses and the length of the payload in bytes. The payload field carries the information from one module to another. Finally, the checksum field, one byte long, protects the integrity of the communication. The service layer uses a command set with three main goals: network management and setup; synchronization of information transaction tasks; and measurement. The service layer protocol has three different types of frames. The data frame transports the information between the applications running in the modules. The command frame establishes the service layer protocol. The third type, listed in Table 20.1, is the acknowledgment frame. The message type is specified in the frame type sub-field of the header field, as reported in Table 20.1. The command set is listed in Table 20.2. At power on, the master searches for slaves available to join the network. It does so by sending the command ‘‘Get_ID Request’’ to every available slave address; each slave that is present acknowledges by responding with the command ‘‘Get_ID Response’’.

Fig. 20.9 General frame format (fields shown: frame type, 3 bit; data pending, 1 bit; message control, 4 bit; source address, 4 bit; destination address, 4 bit; data length, 4 bit; header, 3 byte; payload, max 8 byte, carrying a command with command data or higher-layer data; checksum, 1 byte)

Table 20.1 Values of the frame type subfield

Frame type value (b2 b1 b0)    Description
000                            Reserved
001                            Data
010                            Acknowledgment
011                            MAC command
100–111                        Reserved

Table 20.2 Command frame

Command frame identifier    Command name              Direction
0x00                        Heart_Beat Request        M -> S
0x01                        Heart_Beat Response       S -> M
0x02                        Get_ID Request            M -> S
0x03                        Get_ID Response           S -> M
0x04                        Software Reset Request    M -> S
0x05                        Get_Status Request        M -> S
0x06                        Req_Data Request          M -> S
0x07                        Req_Data Response         S -> M
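To make the frame structure more concrete, the following sketch packs and unpacks a frame using the field widths of Fig. 20.9 and the codes of Tables 20.1 and 20.2. Several details are assumptions made only for this example: the order of the fields inside the bytes, a three-byte header with the data length in the low nibble of the third byte, the master using address 0, and a checksum computed as the 8-bit sum of the preceding bytes (the chapter does not specify the checksum algorithm).

```python
FRAME_DATA, FRAME_ACK, FRAME_MAC_COMMAND = 0b001, 0b010, 0b011   # Table 20.1
GET_ID_REQUEST, GET_ID_RESPONSE = 0x02, 0x03                      # Table 20.2


def checksum(data):
    """Assumed checksum: 8-bit sum of all preceding bytes."""
    return sum(data) & 0xFF


def encode_frame(frame_type, src, dst, payload=b"", pending=False, control=0):
    """Build header + payload + checksum (bit layout is an assumption)."""
    if len(payload) > 8:
        raise ValueError("payload is limited to 8 bytes")
    header = bytes([
        (frame_type & 0x7) << 5 | int(pending) << 4 | (control & 0xF),
        (src & 0xF) << 4 | (dst & 0xF),
        len(payload) & 0xF,
    ])
    body = header + bytes(payload)
    return body + bytes([checksum(body)])


def decode_frame(frame):
    """Check the checksum and split the frame back into its fields."""
    body, received = frame[:-1], frame[-1]
    if checksum(body) != received:
        raise ValueError("checksum mismatch")
    length = body[2] & 0xF
    return {
        "frame_type": body[0] >> 5,
        "pending": bool(body[0] & 0x10),
        "control": body[0] & 0xF,
        "src": body[1] >> 4,
        "dst": body[1] & 0xF,
        "payload": body[3:3 + length],
    }


# Master (address 0, assumed) asks slave 3 for its identifier.
request = encode_frame(FRAME_MAC_COMMAND, src=0, dst=3,
                       payload=bytes([GET_ID_REQUEST]))
```

A slave receiving `request` would call `decode_frame`, find the Get_ID Request identifier in the payload and answer with a frame carrying the Get_ID Response identifier.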

After the successful setup of the network, the master periodically executes the data polling operation. The task checks all slaves for pending messages. This operation allows data exchange among all the units connected by the I2C bus. The memory available in the master to support the communication task is limited,

Fig. 20.10 Polling data operation (diagram of one polling period for master M and slaves 1 to 3: the output buffers at the beginning of the period, the master getting data from the slaves, the master buffers at the middle point, the master sending data to the slaves, and the input buffers at the end of the period; example messages from S1 to S2, S1 to S3, S2 to M, S3 to S1 and S3 to S2)

furthermore, the data polling operation must ensure low access times to the bus for all units connected to it. The polling operation is illustrated in Fig. 20.10, where the data flow through the bus is represented. The data polling operation is carried out in three different phases. (i) The slaves are sequentially scanned by the master, which searches for messages ready to be transferred. (ii) During the routing phase the master inspects the address field of each message in the input buffer: if the message has a slave as its destination it is transferred to the output buffer; if the message has the master as its destination it is sent to the user’s microcontroller. (iii) The data polling operation is finished by the master sending all messages in the output buffer to the respective slaves.
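The three phases can be simulated with ordinary queues. The sketch below is a simplified model rather than the firmware of the platform: the message representation `(source, destination, data)`, the function name `polling_period` and the example traffic (which mirrors the messages of Fig. 20.10) are assumptions.

```python
from collections import deque

MASTER = "M"


def polling_period(slave_outboxes):
    """Run one polling period; messages are (source, destination, data)."""
    input_buffer, output_buffer, for_master = [], [], []

    # Phase (i): scan the slaves in order and collect their pending messages.
    for slave in sorted(slave_outboxes):
        outbox = slave_outboxes[slave]
        while outbox:
            input_buffer.append(outbox.popleft())

    # Phase (ii): route each message by its destination address.
    for msg in input_buffer:
        if msg[1] == MASTER:
            for_master.append(msg)      # handed to the user's microcontroller
        else:
            output_buffer.append(msg)

    # Phase (iii): deliver the output buffer to the destination slaves.
    delivered = {}
    for msg in output_buffer:
        delivered.setdefault(msg[1], []).append(msg)
    return delivered, for_master


outboxes = {
    1: deque([(1, 2, "a"), (1, 3, "b")]),
    2: deque([(2, MASTER, "c")]),
    3: deque([(3, 1, "d"), (3, 2, "e")]),
}
delivered, for_master = polling_period(outboxes)
```

Because every message passes through the master's buffers, the length of a polling period grows with the number of pending messages, which is why the limited memory of the master matters.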

20.5 Conclusion

This chapter presents the development of a platform to support the teaching of embedded systems. The modules presented here allow the implementation of different laboratory experiments with an increasing level of difficulty. At the same time, the student has access to key technologies related to the development of embedded systems. The presence of an expansion bus gives the learning platform high versatility, because it allows the development of new modules. This versatility is further enhanced by the existence of a communication bus that makes data exchange between modules possible.

References

1. Choi SH, Poon CH (2008) An RFID-based anti-counterfeiting system. IAENG Int J Comput Sci 35:1
2. Lin G-L, Cheng C-C (2008) An artificial compound eye tracking pan-tilt motion. IAENG Int J Comput Sci 35:2
3. Rover DT et al (2008) Reflections on teaching and learning in an advanced undergraduate course in embedded systems. IEEE Trans Educ 51(3):400
4. Ricks KG, Jackson DJ, Stapleton WA (2008) An embedded systems curriculum based on the IEEE/ACM model curriculum. IEEE Trans Educ 51(2):262–270
5. Nooshabadi S, Garside J (2006) Modernization of teaching in embedded systems design: an international collaborative project. IEEE Trans Educ 49(2):254–262
6. Ferens K, Friesen M, Ingram S (2007) Impact assessment of a microprocessor animation on student learning and motivation in computer engineering. IEEE Trans Educ 50(2):118–128
7. Hercog D et al (2007) A DSP-based remote control laboratory. IEEE Trans Ind Electron 54(6):3057–3068
8. Caspi P et al (2005) Guidelines for a graduate curriculum on embedded software and systems. ACM Trans Embed Comput Syst 4(3):587–611
9. Chen C-Y et al (2009) EcoSpire: an application development kit for an ultra-compact wireless sensing system. IEEE Embed Syst Lett 1(3):65–68
10. Dinis P, Espírito-Santo A, Ribeiro B, Santo H (2009) MSP430 teaching ROM. Texas Instruments, Dallas
11. Gonçalves T, Espírito-Santo A, Ribeiro BJF, Gaspar PD (2010) Design of a learning environment for embedded system. In: Proceedings of the world congress on engineering 2010, WCE 2010, 30 June–2 July, 2010, London, UK, pp 172–177
12. MSP430™ 16-bit Ultra-Low Power MCUs, Texas Instruments. http://www.ti.com

Chapter 21

Yield Enhancement with a Novel Method in Design of Application-Specific Networks on Chips

Atena Roshan Fekr, Majid Janidarmian, Vahhab Samadi Bokharaei and Ahmad Khademzadeh

Abstract Network on Chip (NoC) has been proposed as a new paradigm for designing System on Chip (SoC) that supports a high degree of scalability and reusability. One of the most important issues in NoC design is how to map an application onto an NoC-based architecture so as to satisfy the performance and cost requirements. In this paper, a novel procedure is introduced to find an optimal application-specific NoC using Particle Swarm Optimization (PSO) and a linear function that considers communication cost, robustness index and contention factor. Communication cost is a common metric in the evaluation of mapping algorithms and has a direct impact on the power consumption and performance of the mapped NoC. Robustness index is used as a criterion for estimating the fault-tolerant properties of NoCs, and contention factor strongly affects latency, throughput and communication energy consumption. Experimental results on two real core graphs, VOPD and MPEG-4, show the ability of the proposed procedure to explore the design space and how effectively a designer can customize and prioritize the impact of the metrics.

A. R. Fekr (✉), M. Janidarmian
CE Department, Science and Research Branch, Islamic Azad University, Tehran, Iran

V. S. Bokharaei
ECE Department, Shahid Beheshti University, Tehran, Iran

A. Khademzadeh
Iran Telecommunication Research Center, Tehran, Iran

S. I. Ao and L. Gelman (eds.), Electrical Engineering and Applied Computing, Lecture Notes in Electrical Engineering, 90, DOI: 10.1007/978-94-007-1192-1_21, © Springer Science+Business Media B.V. 2011


21.1 Introduction
Due to the ever-increasing complexity of system on chip (SoC) design and the inefficiency of electrical buses for exchanging data between IP cores at giga scale, the Network on Chip (NoC) has been proposed as a more flexible, scalable and reliable infrastructure. Different mapping algorithms for NoCs have been presented to decide which core should be linked to which router. Mapping an application onto the on-chip network is the first and most important step in the design flow, as it dominates the overall performance and cost [1]. The main purpose of this study is to present a new method that generates a wide range of mappings covering all reasonable values of communication cost. The most appropriate mapping is then selected by a total cost function, which is linear and can be customized by the designer according to the impact of three key parameters: communication cost, robustness index and contention factor. The proposed procedure is shown in Fig. 21.1 and explained in the following sections. Although the proposed approach is topology-independent, it is illustrated and evaluated for the 2D mesh topology, which is widely used by mapping algorithms.

21.2 Particle Swarm Optimization as a Mapping Generator
Many mapping algorithms have recently been proposed to improve various parameters of NoC design. One of the most important of these is the communication cost, and several mapping algorithms aim to minimize it. Using small hop counts between communicating cores significantly lowers the communication cost. Small hop counts also reduce energy consumption and improve other performance metrics such as latency [2]. However, reducing hop counts can weaken the fault-tolerant properties of the NoC. The optimal solution therefore minimizes the communication cost while maximizing the fault-tolerant properties of the NoC. In this paper, the particle swarm optimization (PSO) algorithm is used to search for this solution. As a population-based swarm intelligence technique, PSO simulates animal social behaviors such as bird flocking and fish schooling. Owing to its simple concept and ease of implementation, it has gained much attention and many improvements have been proposed [3]. In a PSO system, multiple candidate solutions coexist and collaborate simultaneously. Each solution, called a "particle", flies through the problem space according to its own "experience" as well as the experience of neighboring particles. Unlike other evolutionary computation algorithms, each particle in PSO uses two pieces of information, velocity and position, to search the problem space (Fig. 21.1).


Fig. 21.1 The proposed procedure to achieve the optimal application-specific Network-on-Chip

The velocity predicts the next direction of movement, while the position vector is used to locate the optimum region. In standard particle swarm optimization, the velocity vector is updated as follows:


Fig. 21.2 Particle swarm optimization algorithm

$$v_{jk}(t+1) = w_t\, v_{jk}(t) + c_1 r_1 \left(p_{jk}(t) - x_{jk}(t)\right) + c_2 r_2 \left(p_{gk}(t) - x_{jk}(t)\right), \qquad w_{t+1} = w_t\, w_{damp} \tag{21.1}$$

where $v_{jk}(t)$ and $x_{jk}(t)$ represent the kth coordinates of the velocity and position vectors of particle j at time t, respectively. $p_{jk}(t)$ is the kth coordinate of the best position that particle j has found so far, and $p_{gk}(t)$ denotes the corresponding coordinate of the best position found by the whole swarm. The inertia weight $w_t$, the cognitive coefficient $c_1$ and the social coefficient $c_2$ are three parameters controlling the size of the velocity vector. $r_1$ and $r_2$ are two random numbers generated with normal distributions within the interval [0, 1]. With the corresponding velocity information, each particle flies according to the following rule (Eq. 21.2) [3]. This concept is shown in Fig. 21.2:

$$x_{jk}(t+1) = x_{jk}(t) + v_{jk}(t+1) \tag{21.2}$$

It is worth mentioning that Onyx is one of the best mapping algorithms in terms of communication cost. Starting from the Onyx result and exploiting the evolutionary nature of PSO, different mappings are created with a variety of communication costs. To do this, the Onyx result is injected into the population initialization step as a particle, as shown in Fig. 21.1b. To avoid rapid convergence, no velocity threshold is defined, and $c_1$, $c_2$, $w_0$ and $w_{damp}$ are set to 3.49, 7.49, 1 and 0.99, respectively, in the proposed PSO algorithm. These values were chosen after several simulations, because they drastically affect the diversity of the results.
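The update rules of Eqs. 21.1 and 21.2, together with the seeding and parameter choices described above, can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: it searches a continuous space, draws r1 and r2 uniformly from [0, 1] (an assumption made here for simplicity), and the names `run_pso`, `seed_particle` and `sphere` are hypothetical. Decoding a particle position into a core-to-tile mapping is not shown, since the chapter does not describe the encoding.

```python
import random


def run_pso(fitness, dim, n_particles=20, n_iter=30,
            c1=3.49, c2=7.49, w0=1.0, w_damp=0.99,
            seed_particle=None, seed=0):
    """Minimise `fitness`; return (best_position, best_cost, visited)."""
    rng = random.Random(seed)
    x = [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(n_particles)]
    if seed_particle is not None:
        x[0] = list(seed_particle)      # inject a known good solution (the role Onyx plays)
    v = [[0.0] * dim for _ in range(n_particles)]
    p_best = [row[:] for row in x]      # best position found by each particle
    p_cost = [fitness(row) for row in x]
    g = min(range(n_particles), key=lambda j: p_cost[j])
    g_best, g_cost = p_best[g][:], p_cost[g]
    visited = []                        # every evaluated candidate, kept for later selection
    w = w0
    for _ in range(n_iter):
        for j in range(n_particles):
            for k in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Eq. 21.1: no velocity threshold is applied, as in the text
                v[j][k] = (w * v[j][k]
                           + c1 * r1 * (p_best[j][k] - x[j][k])
                           + c2 * r2 * (g_best[k] - x[j][k]))
                x[j][k] += v[j][k]      # Eq. 21.2
            cost = fitness(x[j])
            visited.append((cost, x[j][:]))
            if cost < p_cost[j]:
                p_best[j], p_cost[j] = x[j][:], cost
            if cost < g_cost:
                g_best, g_cost = x[j][:], cost
        w *= w_damp                     # inertia damping, second part of Eq. 21.1
    return g_best, g_cost, visited


def sphere(p):
    """Toy fitness used only for this illustration."""
    return sum(c * c for c in p)


best, best_cost, visited = run_pso(sphere, dim=3, n_particles=10, n_iter=20,
                                   seed_particle=[0.1, 0.0, -0.1])
```

Keeping the `visited` list mirrors the idea of using PSO as a generator of many candidates rather than only as an optimizer: the final choice can be made afterwards from everything that was evaluated.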

21.3 Experimental Results of Mapping Generator
The real core graphs VOPD and MPEG-4 [2] are used with the proposed PSO algorithm, which was run with an initial population of 1000 for 200 iterations. Figure 21.3a shows the minimum, mean and maximum fitness function values in each iteration. As shown in Fig. 21.3b, thanks to the convergence control described above, the PSO algorithm generates different mappings of the VOPD and MPEG-4 core graphs with all reasonable communication cost values. There are 119,912 and 156,055 unique mappings for the VOPD and MPEG-4 core graphs, respectively. It is worth noting that this approach also enables the designer to consider other key parameters.

Fig. 21.3 a Minimum, mean and maximum fitness function values for VOPD and MPEG-4 core graphs, b ability of the proposed mapping generator in producing mappings with all reasonable communication cost values

21.4 Robustness Index
The robustness index is used as a criterion for estimating the fault-tolerant properties of NoCs [4]. The greater the robustness index, the more fault tolerant the NoC design. The robustness index, RI, is based on an extension of the concept of path diversity [5]. For a given communication $c_k \in C$, an NoC architecture graph $A(T, L)$, a mapping function $M$ and a routing function $R$, [4] defines the robustness index for communication $c_k$, $RI(c_k)$, as the average number of routing paths available for $c_k$ if a link belonging to the set of links used by $c_k$ is faulty. Formally,

$$RI(c_k) = \frac{1}{|L(c_k)|} \sum_{l_{i,j} \in L(c_k)} \left| \rho(c_k) \setminus \rho(c_k, l_{i,j}) \right| \tag{21.3}$$

where $\rho(c_k)$ is the set of paths provided by $R$ for communication $c_k$, $\rho(c_k, l_{i,j})$ is the set of paths provided by $R$ for communication $c_k$ that use link $l_{i,j}$, and $L(c_k)$ is the set of links belonging to paths in $\rho(c_k)$. Suppose that there are two routing functions, A and B, where routing function A selects path1 and path2 and routing function B selects path2 and path3 to route packets between source and destination, as shown in Fig. 21.1c. Routing function A selects two disjoint paths, so a faulty link in one path does not compromise communication from source to destination, since the other path is fault-free. When routing function B is used, however, the alternative paths share link $l_4$, so any fault in $l_4$ makes communication from "source" to "destination" impossible. Consequently, the NoC that uses routing function A, $NOC_1$, is more robust than the NoC that uses routing function B, call it $NOC_2$. This situation is reflected by the robustness index. The robustness indices for the two cases are:

$$RI^{(NOC_1)}(\text{source} \rightarrow \text{destination}) = \frac{1+1+1+1+1+1}{6} = 1,$$

$$RI^{(NOC_2)}(\text{source} \rightarrow \text{destination}) = \frac{0+1+1+1+1}{5} = 0.8.$$
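The per-communication robustness index can be checked with a few lines of code. The sketch below is a minimal illustration under stated assumptions: each path is given as a list of link names, every path is assumed to have three links (which matches the six and five terms in the worked example), and the link names other than the shared link l4 are hypothetical because Fig. 21.1c is not reproduced here.

```python
def robustness_index(paths):
    """Average number of paths that survive when one used link fails.

    `paths` is a non-empty list of paths, each an iterable of link identifiers.
    """
    path_sets = [frozenset(p) for p in paths]
    links = set().union(*path_sets)          # L(c_k): links used by any path
    surviving = [sum(1 for p in path_sets if link not in p) for link in links]
    return sum(surviving) / len(links)


# Routing function A: two disjoint paths (hypothetical link names).
noc1_paths = [["l1", "l2", "l3"], ["l4", "l5", "l6"]]
# Routing function B: two paths that share link l4.
noc2_paths = [["l4", "l5", "l6"], ["l4", "l7", "l8"]]

ri_noc1 = robustness_index(noc1_paths)
ri_noc2 = robustness_index(noc2_paths)
```

For the disjoint case every single-link fault leaves one path, giving 6/6; for the shared-link case a fault on l4 leaves none, giving 4/5, in agreement with the values above.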

$NOC_1$, using path1 and path2, is more robust than $NOC_2$, using path2 and path3, for communication from "source" to "destination", since $RI^{(NOC_1)} > RI^{(NOC_2)}$. The global robustness index, which characterizes the network, is calculated as the weighted sum of the robustness indices of the individual communications. For a communication $c_k$, the weight of $RI(c_k)$ is the degree of adaptivity [6] of $c_k$, that is, the ratio of the number of allowed minimal paths to the total number of possible minimal paths between the source node and the destination node associated with $c_k$. The global robustness index is defined by Eq. 21.4:

$$RI^{(NOC)} = \sum_{c_k \in C} \alpha(c_k)\, RI^{(NOC)}(c_k) \tag{21.4}$$

where $\alpha(c_k)$ indicates the degree of adaptivity of communication $c_k$. In this paper, one of the best algorithms customized for routing in application-specific NoCs is used. The algorithm, presented in [7], is a highly adaptive deadlock-free routing algorithm. It uses the Application-Specific Channel Dependency Graph (ASCDG) concept to guarantee freedom from deadlock [8]. Removing cycles in the ASCDG has a great impact on parameters such as the robustness index and can be done by different methods. Therefore, in this paper, this step is skipped and left to the designer, who can use a preferred method.
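As a rough illustration of Eq. 21.4, the degree of adaptivity and the weighted sum can be sketched as below. The sketch assumes a 2D mesh, where the number of minimal paths between two tiles that are dx columns and dy rows apart is the binomial coefficient C(dx + dy, dx). The function names and the example values are hypothetical and are not taken from the chapter's experiments.

```python
from math import comb


def minimal_path_count(src, dst):
    """Number of minimal paths between two tiles of a 2D mesh."""
    dx, dy = abs(src[0] - dst[0]), abs(src[1] - dst[1])
    return comb(dx + dy, dx)


def degree_of_adaptivity(allowed_minimal_paths, src, dst):
    """Ratio of allowed minimal paths to all possible minimal paths."""
    return allowed_minimal_paths / minimal_path_count(src, dst)


def global_robustness_index(communications):
    """Eq. 21.4: sum of alpha(c_k) * RI(c_k) over (alpha, ri) pairs."""
    return sum(alpha * ri for alpha, ri in communications)


# Hypothetical communication from tile (0, 0) to tile (2, 1):
# 3 minimal paths exist, and the routing function is assumed to allow 2.
alpha = degree_of_adaptivity(2, (0, 0), (2, 1))
global_ri = global_robustness_index([(alpha, 0.8), (1.0, 1.0)])
```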


21.5 Contention Factor
In [9], an integer linear programming formulation of the contention-aware application mapping problem was presented, which aims at minimizing inter-tile network contention. That work focuses on the network contention problem, which strongly affects latency, throughput and communication energy consumption. Source-based contention occurs when two traffic flows originating from the same source contend for the same links. Destination-based contention occurs when two traffic flows with the same destination contend for the same links. Finally, path-based contention occurs when two data flows that neither come from the same source nor go to the same destination contend for the same links somewhere in the network. The impact of these three types of contention was evaluated in [9], and path-based contention was observed to have the most significant impact on packet latency. Figure 21.1d shows path-based contention. In this paper, we therefore consider this type of contention as a factor of the mappings. More formally:

$$\text{Contention Factor} = \sum_{\forall e_{i,j} \in E} \left| L\left(r_{map(m_i),\,map(m_j)}\right) \cap L\left(r_{map(m_k),\,map(m_l)}\right) \right| \quad \text{for } i \neq k \text{ and } j \neq l \tag{21.5}$$

Given the communication cost, robustness index and contention factor of each unique mapping, the best application-specific Network on Chip configuration can be chosen according to the designer's decisions.
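The path-based contention count of Eq. 21.5 can be illustrated with a short sketch. Several assumptions are made here that are not in the chapter: deterministic XY routing on a 2D mesh is used in place of the adaptive routing of [7], links are treated as directed, L(r) is read as the set of links on route r, and each unordered pair of flows is counted once. The core names and tile coordinates are hypothetical.

```python
from itertools import combinations


def xy_route_links(src, dst):
    """Directed links of the XY route from tile `src` to tile `dst`."""
    links = set()
    x, y = src
    while x != dst[0]:                       # travel along X first
        nx = x + (1 if dst[0] > x else -1)
        links.add(((x, y), (nx, y)))
        x = nx
    while y != dst[1]:                       # then along Y
        ny = y + (1 if dst[1] > y else -1)
        links.add(((x, y), (x, ny)))
        y = ny
    return links


def contention_factor(edges, mapping):
    """Count links shared by flows with different sources and destinations."""
    total = 0
    for (i, j), (k, l) in combinations(edges, 2):
        if i != k and j != l:                # path-based contention only
            shared = (xy_route_links(mapping[i], mapping[j])
                      & xy_route_links(mapping[k], mapping[l]))
            total += len(shared)
    return total


# Hypothetical core graph edges and core-to-tile mapping.
edges = [("A", "B"), ("C", "D"), ("A", "C")]
mapping = {"A": (0, 0), "B": (2, 0), "C": (1, 0), "D": (2, 1)}
cf = contention_factor(edges, mapping)
```

In this toy case the flows A to B and C to D both use the link from tile (1, 0) to tile (2, 0), so one shared link is counted; the pair A to B and A to C is skipped because it is source-based contention.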

21.6 Optimal Solution Using a Linear Function
As previously mentioned, a lower communication cost leads to an NoC with better metrics such as energy consumption and latency. The other metrics introduced were the robustness index, which is a measurable criterion for fault-tolerant properties, and the contention factor, which has a significant impact on packet latency. A total cost function is introduced to minimize the weighted sum of these metrics (Fig. 21.1e):

$$\text{Total Cost Function} = \min \left\{ \frac{\delta_1}{\alpha}\, commcost_i + \frac{\delta_2}{\beta} \left(-RI_i^{(NOC)}\right) + \frac{\delta_3}{\gamma}\, CF_i \right\}, \quad \forall\, mapping_i \in \text{generated mappings}, \quad \delta_1 + \delta_2 + \delta_3 = 1 \tag{21.6}$$

where $commcost_i$ is the communication cost, $RI_i^{(NOC)}$ is the robustness index and $CF_i$ is the contention factor of the NoC after applying $mapping_i$. The robustness index enters with a negative sign because a larger robustness index is better while the function is minimized. The constants $\alpha$, $\beta$ and $\gamma$ are used to normalize $commcost$, $RI^{(NOC)}$ and $CF$. In this paper, they are set to the maximum values obtained for communication cost, robustness index and contention factor, respectively. $\delta_1$, $\delta_2$ and $\delta_3$ are the weighting coefficients used to balance the metrics. Although multi-objective evolutionary algorithms can be used to solve this problem, the proposed procedure is considered advantageous in this study for the following reasons. First, a designer can change the weighting coefficients without rerunning the algorithm. Second, owing to the convergence control, the results are more diverse than those of multi-objective evolutionary algorithms, and the diversity can be increased further by enlarging the population size and/or the number of iterations. Finally, if the designer focuses on communication cost, evolutionary algorithms do not usually reach the optimal communication cost.
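The selection step of Eq. 21.6 can be sketched as follows. This is an illustrative sketch only: the three candidate mappings and their metric values are invented for the example, the robustness index is assumed to enter with a negative sign so that more robust mappings cost less, and the names `select_mapping` and `candidates` are hypothetical.

```python
def select_mapping(mappings, d1, d2, d3):
    """Return the mapping that minimises the weighted, normalised cost."""
    if abs(d1 + d2 + d3 - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    # Normalisation constants: maximum observed value of each metric
    # (assumed to be non-zero in this sketch).
    a = max(m["commcost"] for m in mappings)
    b = max(m["ri"] for m in mappings)
    c = max(m["cf"] for m in mappings)

    def total_cost(m):
        return d1 / a * m["commcost"] + d2 / b * (-m["ri"]) + d3 / c * m["cf"]

    return min(mappings, key=total_cost)


# Hypothetical candidates, not values from the experiments.
candidates = [
    {"name": "m0", "commcost": 4000, "ri": 40, "cf": 300},
    {"name": "m1", "commcost": 4400, "ri": 55, "cf": 280},
    {"name": "m2", "commcost": 6000, "ri": 60, "cf": 100},
]

balanced = select_mapping(candidates, 0.5, 0.3, 0.2)
cost_first = select_mapping(candidates, 0.9, 0.05, 0.05)
```

Because the candidates are already stored, changing the weights only repeats the cheap selection step, which is the first advantage claimed for the procedure.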

21.7 Final Experimental Results
To investigate the capabilities of the procedure shown in Fig. 21.1, we ran experiments on the real core graphs VOPD and MPEG-4. As mentioned before, one advantage of the proposed mapping generator is the diversity of the solutions it produces. In the experiments, the mapping generator produced 201,000 mappings for each of VOPD and MPEG-4, a number bounded by the population size and the maximum number of iterations of the PSO algorithm. Discarding duplicate mappings left 119,912 and 156,055 unique mappings for VOPD and MPEG-4, respectively. The results of running the procedure for the VOPD and MPEG-4 core graphs and evaluating the values in the 3D design space are shown in Figs. 21.4, 21.5, 21.6, 21.7, 21.8, 21.9, 21.10 and 21.11. The values of $\delta_1$, $\delta_2$ and $\delta_3$ (the weighting coefficients of Eq. 21.6) used in these experiments are 0.5, 0.3 and 0.2 for the VOPD core graph and 0.1, 0.2 and 0.7 for the MPEG-4 core graph, respectively. As these figures show, many different mappings share the same communication cost value, which is one of the strengths of the proposed mapping generator. On average, there are almost 18 and 12 different mappings for each distinct communication cost value for VOPD and MPEG-4, respectively. The optimal application-specific NoC configuration can be selected by setting suitable weights in the total cost function according to the designer's demands. In our design, the optimal VOPD mapping has communication cost 4347, robustness index 54.28 and contention factor 284. The optimal MPEG-4 mapping has communication cost 6670.5, robustness index 35.94 and contention factor 6.


Fig. 21.4 Robustness index, contention factor and communication cost of VOPD mappings in 3D design space

Fig. 21.5 Communication cost, robustness index and total cost of VOPD mappings in 3D design space

Fig. 21.6 Communication cost, contention factor and total cost of VOPD mappings in 3D design space

Fig. 21.7 Robustness index, contention factor and total cost of VOPD mappings in 3D design space


Fig. 21.8 Robustness index, contention factor and communication cost of MPEG-4 mappings in 3D design space

Fig. 21.9 Communication cost, robustness index and total cost of MPEG-4 mappings in 3D design space

Fig. 21.10 Communication cost, contention factor and total cost of MPEG-4 mappings in 3D design space

Fig. 21.11 Robustness index, contention factor and total cost of MPEG-4 mappings in 3D design space


21.8 Conclusion
As mapping is the most important step in Network-on-Chip design, this paper presented a new mapping generator based on the Particle Swarm Optimization algorithm. The best mapping in terms of communication cost was derived from the Onyx mapping algorithm and injected into the population initialization step as a particle. Because Onyx mapping results were used as particles, convergence was controlled by choosing appropriate values for the velocity vector parameters. The PSO algorithm is able to generate different mappings with all reasonable communication cost values. Using three metrics, namely communication cost, robustness index and contention factor, for each unique mapping, the best application-specific Network-on-Chip configuration can be selected according to the designer's demands, which are applied through the total cost function.

Acknowledgments This chapter is an extended version of the paper [10] published in the proceedings of The World Congress on Engineering 2010, WCE 2010, London, UK.

References
1. Shen W, Chao C, Lien Y, Wu A (2007) A new binomial mapping and optimization algorithm for reduced-complexity mesh-based on-chip network. Networks-on-chip, NOCS, 7–9 May 2007, pp 317–322
2. Janidarmian M, Khademzadeh A, Tavanpour M (2009) Onyx: a new heuristic bandwidth-constrained mapping of cores onto tile based Network on Chip. IEICE Electron Express 6(1):1–72
3. Cui Z, Cai X, Zeng J (2009) Chaotic performance-dependent particle swarm optimization. Int J Innov Comput Inf Control 5(4):951–960
4. Tornero R, Sterrantino V, Palesi M, Orduna JM (2009) A multi-objective strategy for concurrent mapping and routing in networks on chip. In: Proceedings of the 2009 IEEE international symposium on parallel & distributed processing, pp 1–8
5. Dally WJ, Towles B (2004) Principles and practices of interconnection networks. Morgan Kaufmann, San Francisco
6. Glass CJ, Ni LM (1994) The turn model for adaptive routing. J Assoc Comput Mach 41(5):874–902
7. Palesi M, Longo G, Signorino S, Holsmark R, Kumar S, Catania V (2008) Design of bandwidth aware and congestion avoiding efficient routing algorithms for networks-on-chip platforms. In: Second ACM/IEEE international symposium on networks-on-chip, NoCS 2008, pp 97–106
8. Palesi M, Holsmark R, Kumar S (2006) A methodology for design of application specific deadlock-free routing algorithms for NoC systems. In: Proceedings of the 4th international conference on hardware/software codesign and system synthesis, CODES+ISSS '06, pp 142–147
9. Chou C, Marculescu R (2009) Contention-aware application mapping for Network-on-Chip communication architectures. In: IEEE international conference on computer design, ICCD 2008, vol 19, pp 164–169
10. Roshan Fekr A, Khademzadeh A, Janidarmian M, Samadi Bokharaei V (2010) Bandwidth/fault tolerance/contention aware application-specific NoC using PSO as a mapping generator. In: Lecture notes in engineering and computer science: proceedings of the world congress on engineering 2010, WCE 2010, 30 June–2 July 2010, London, UK, pp 247–252

Chapter 22

On-Line Image Search Application Using Fast and Robust Color Indexing and Multi-Thread Processing

Wichian Premchaisawadi and Anucha Tungkatsathan

Abstract Keyword-based image search engines such as Google or Yahoo may return a large number of junk images that are irrelevant to the given keyword-based queries. In this paper, an interactive approach is developed to filter out the junk images from keyword-based Yahoo image search results obtained through the Yahoo BOSS API. A multi-threaded processing framework is proposed to incorporate an image analysis algorithm into text-based image search engines. It enhances the capability of an application when downloading images, indexing them, and comparing the similarity of images retrieved from diverse sources. We also propose an efficient color descriptor technique for image feature extraction, namely Auto Color Correlogram and Correlation (ACCC), to improve the efficiency of the image retrieval system and reduce the processing time. Experimental evaluations based on the coverage ratio measure show that our scheme significantly improves retrieval performance over existing image search engines.

W. Premchaisawadi (✉), A. Tungkatsathan
Graduate School of Information Technology in Business, Siam University, 38 Petkasem Rd., Phasi-Charoen, Bangkok, Thailand

S. I. Ao and L. Gelman (eds.), Electrical Engineering and Applied Computing, Lecture Notes in Electrical Engineering, 90, DOI: 10.1007/978-94-007-1192-1_22, © Springer Science+Business Media B.V. 2011

22.1 Introduction
Most of the popular commercial search engines, such as Google, Yahoo and, more recently, Microsoft's Bing, have achieved great success by exploiting pure keyword features for retrieval from large-scale online image collections. Unfortunately, these image search engines are still unsatisfactory because of their relatively low precision and the appearance of large numbers of junk images [1]. One of the main reasons is that these engines do not use the visual signature of an image for indexing and retrieval. Image indexing, retrieval and similarity measurement among images generally take a long computation time, so they are not well suited to real-time processing. Many researchers have tried to reduce the computation time by applying distributed computing, for instance cluster computing [2–7]. Lu et al. presented a parallel technique for feature extraction and similarity comparison of visual features, developed on a cluster architecture. Their experiments show that parallel computing can significantly improve the performance of a retrieval system [2]. Kao et al. proposed a cluster platform that supports the implementation of retrieval approaches used in CBIR systems. Their paper introduces the basic principles of image retrieval with dynamic feature extraction on a cluster platform architecture. Its main focus is workload balancing across the cluster with a scheduling heuristic, together with execution performance measurements of the implemented prototype [3]. Ling and Ouyang proposed a parallel algorithm for semantic concept mapping that adopts a two-stage concept searching method. It speeds up low-level feature extraction, latent semantic concept model searching, and the bridging of image low-level features and a global sharable ontology [4]. Kao presents techniques for parallel multimedia retrieval, taking an image database as an example.
The main idea is that distributing the image data over a large number of nodes enables parallel processing of the compute-intensive operations for dynamic image retrieval. However, the approach still relies on partitioning the data and on the strategies applied for workload balancing [5]. Although cluster computing is widely used in image retrieval approaches, it only attacks the problem at the macro level. In particular, designing a distributed algorithm and programming it with cross-platform capability is difficult. In contrast, this paper is concerned with the micro-level aspect of the problem and uses multi-threading. Multi-threading is not the same as distributed processing. Distributed processing, which is sometimes called parallel processing, and multi-threading are both techniques used to achieve parallelism, and they can be used in combination [8]. Fortunately, with the increasing computational power of modern computers, some of the most time-consuming tasks in image indexing and retrieval are easily parallelized, so the multi-core architecture of modern CPUs and multi-threaded processing can be exploited to speed up image processing tasks. Moreover, it is possible to incorporate an image analysis algorithm into text-based image search engines such as Google, Yahoo and Bing without significantly degrading their response time [9]. We also present a modified advanced algorithm, namely auto color correlogram and correlation (ACCC) [10], based on the color correlogram (CC) [11], for extracting and indexing low-level features of images. The framework of multi-threaded processing for an on-line CBIR application is


proposed. It enhances the capability of an application when downloading images and comparing the similarity of images retrieved from diverse sources. Section 22.2 presents the framework of an on-line image retrieval system with multi-threading. Section 22.3 discusses the proposed indexing technique, which is designed to speed up image processing tasks. The experimental study is presented in Sect. 22.4 and concluding remarks are set out in Sect. 22.5.

22.2 The Proposed Framework of Multithreading for an On-Line CBIR System
Before introducing our framework of multi-threading for an on-line CBIR application, we briefly examine the properties of the queries to be answered. We have developed a novel framework of real-time processing for an on-line CBIR application, using relevant images from Yahoo Images. Our method consists of the following major steps: (a) Yahoo Images is first used to obtain a large number of images returned for a given text-based query; (b) the user selects a relevant image, and the user's feedback is automatically collected to update the input query for image similarity characterization; (c) a multi-threaded processing method is used to manage and perform data parallelism, or loop-level parallelism, in tasks such as downloading images, extracting visual features and computing visual similarity measures; (d) if necessary, the user can also change the keyword before selecting a relevant image for the query; (e) the updated queries are then used to adaptively create a new answer for the next set of returned images according to the user's personal preferences (see Fig. 22.1). In this section, a multi-threaded processing method is used to run multiple threads in parallel for a specific purpose. Multi-threading is a way to let a program do more than one thing at a time; it is implemented within a single program running on a single system. The number of threads should be chosen with care, and threads must be assigned to the right parts of the program in order to be used efficiently. The functions, classes and objects of the program should be designed logically as a sequence of steps. In this research, we first use threads to improve the speed of downloading images from various sources, according to the locations specified in the .xml file returned by the Yahoo BOSS API [12].
Second, threads increase the speed of image feature extraction and of the similarity measurement of feature vectors. The framework of multi-thread processing is presented in Fig. 22.2. The thread control and the tasks inside a thread for retrieving images are presented in Figs. 22.3 and 22.4. An image list control receives the .xml files returned by the Yahoo BOSS API. The lists of URLs are obtained from the .xml files; they are displayed and used for downloading images from the hosts. An image download module is designed to work as a multi-threaded process for downloading images from diverse sources. It is controlled by an image search control module. The image search control module performs a very important function in the

Fig. 22.1 Basic principles of the proposed system (the user enters a keyword, e.g. "Apple"; images are returned from the Yahoo database; the user selects an image query; similar images 1–8 are displayed)

management of the system. It supports and controls all modules of the on-line CBIR system, and it checks for errors and the input/output status of each module. Most importantly, it synchronizes the multiple threads that perform image download and similarity measurement in the associated modules. The similarity measurement module computes the feature vectors and distance metrics of all images obtained from the image download module. The image download and similarity measurement modules work concurrently. The query results are recorded in a session array in sequential order. The image list object is responsible for arranging all images displayed by the application.
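The download, feature extraction and similarity steps described in this section can be sketched with a thread pool. The sketch below is only an illustration in Python, whereas the application described in the chapter is built on Microsoft .NET; the function `retrieve`, the in-memory `fake_store` and the `img://` identifiers are hypothetical stand-ins, and no network access or real image decoding is performed.

```python
from concurrent.futures import ThreadPoolExecutor


def retrieve(urls, download, extract, distance, query_features, workers=8):
    """Download and score every image concurrently, then sort by distance."""
    def process(url):
        image = download(url)                        # I/O-bound step
        return url, distance(query_features, extract(image))

    with ThreadPoolExecutor(max_workers=workers) as pool:
        scored = list(pool.map(process, urls))       # results keep the input order
    return sorted(scored, key=lambda pair: pair[1])  # most similar first


# Hypothetical stand-ins so that the sketch runs without a network.
fake_store = {
    "img://a": [0.2, 0.8],
    "img://b": [0.9, 0.1],
    "img://c": [0.25, 0.75],
}


def fake_download(url):
    return fake_store[url]


def identity_features(image):
    return image


def l1_distance(q, f):
    return sum(abs(x - y) for x, y in zip(q, f))


ranked = retrieve(list(fake_store), fake_download, identity_features,
                  l1_distance, query_features=[0.2, 0.8])
```

In CPython, threads mainly help the download step, because it waits on I/O; whether feature extraction also speeds up depends on the runtime and on the libraries used, so the sketch makes no claim about that.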

22.3 Feature Computation
This paper's main focus is on parallel computing techniques for image retrieval. The main objective is to reduce the processing time of a real-time CBIR system. However, an efficient color descriptor technique for image feature extraction is still required to reduce the processing time. In this section, we present an efficient algorithm for the proposed framework. It is a modification of the correlogram technique for color indexing. The auto color correlation (ACC) [10] expresses the mean color of all pixels of color $C_j$ at distance k from a pixel of color $C_i$ in the image. Formally, the ACC of image $\{I(x, y),\ x = 1, 2, \ldots, M,\ y = 1, 2, \ldots, N\}$ is defined by Eq. 22.1.

Fig. 22.2 The framework of real-time multi-threaded processing for an on-line CBIR application (modules shown: keyword search, Yahoo! BOSS API, XML parser and image list, image search control, image download threads, feature extraction and comparison threads, similarity result array with sorting module, similarity images list)

$$ACC(i, j, k) = MC_j\,\gamma^{(k)}_{c_i c_j}(I) = \left\{ rmc_j\,\gamma^{(k)}_{c_i c_j}(I),\ gmc_j\,\gamma^{(k)}_{c_i c_j}(I),\ bmc_j\,\gamma^{(k)}_{c_i c_j}(I) \;\middle|\; c_i \neq c_j \right\} \tag{22.1}$$

where the original image $I(x, y)$ is quantized to m colors $C_1, C_2, \ldots, C_m$ and the distance between two pixels $d \in [\min\{M, N\}]$ is fixed a priori. Let $MC_j$ be the mean color of the pixels of color $C_j$ found at distance k from pixels of color $C_i$ in an image I. The arithmetic mean colors are computed by Eq. 22.2.

Fig. 22.3 The thread control and the tasks inside the thread for downloading images (thread states: Unstarted, Running, Wait/Sleep/Join, Abort Requested, Stopped; tasks while running: download image, image preparation, save image)

Fig. 22.4 The thread control and the tasks inside the thread for retrieving images

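$$\begin{aligned}
rmc_j\,\gamma^{(k)}_{c_i c_j}(I) &= \frac{\Gamma^{(k)}_{c_i,\,rc_j}(I)}{\Gamma^{(k)}_{c_i,\,c_j}(I)} \;\Big|\; c_i \neq c_j \\
gmc_j\,\gamma^{(k)}_{c_i c_j}(I) &= \frac{\Gamma^{(k)}_{c_i,\,gc_j}(I)}{\Gamma^{(k)}_{c_i,\,c_j}(I)} \;\Big|\; c_i \neq c_j \\
bmc_j\,\gamma^{(k)}_{c_i c_j}(I) &= \frac{\Gamma^{(k)}_{c_i,\,bc_j}(I)}{\Gamma^{(k)}_{c_i,\,c_j}(I)} \;\Big|\; c_i \neq c_j
\end{aligned} \tag{22.2}$$

The numerator $\Gamma^{(k)}_{c_i,\,xc_j}(I)$ is the total of the pixel color values of color $C_j$ at distance k from any pixel of color $C_i$, where $xC_j$ is a channel of color $C_j$ in RGB color space and $C_j \neq 0$. The denominator, N, is the number of pixels of color $C_j$ counted at distance k from pixels of color $C_i$, defined by Eq. 22.3.

$$N = \Gamma^{(k)}_{c_i,\,c_j}(I) = \left\{ P(x_1, y_1) \in C_i \;\middle|\; P(x_2, y_2) \in C_j;\ k = \min\{|x_1 - x_2|, |y_1 - y_2|\} \right\} \tag{22.3}$$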

We propose an extended technique of ACC based on the autocorrelogram, namely Auto Color Correlogram and Correlation (ACCC). It is the integration of the Autocorrelogram [5] and Auto Color Correlation [10] techniques. However, the size of ACCC is still O(md). The Auto Color Correlogram and Correlation is defined by Eq. 22.4.

\[
\mathrm{ACCC}(i, j, k) = \left\{ \gamma^{(k)}_{c_i}(I),\; MC_j\,\gamma^{(k)}_{c_i c_j}(I) \right\}
\tag{22.4}
\]

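Let the ACCC pairs for the $m$ color bins be $(\alpha_i, \beta_i)$ in $I$ and $(\alpha'_i, \beta'_i)$ in $I'$. The similarity of the images is measured as the distance $d(I, I')$ between their autocorrelograms (AC) and auto color correlations (ACC), which is derived from Lee et al. [13]. It is given by Eq. 22.5.

\[
d(I, I') = \lambda_1 \sum_{\forall i} \frac{|\alpha_i - \alpha'_i|}{0.1 + \alpha_i + \alpha'_i}
+ \lambda_2 \sum_{\forall i} \frac{|\beta_i - \beta'_i|}{0.1 + \beta_i + \beta'_i}
\tag{22.5}
\]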
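As a rough illustration of Eq. 22.5, the distance can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, the second weight is assumed equal to the first because the text only gives the value 0.5 for the autocorrelogram weight, and the RGB mean colors are assumed to be compared channel by channel.

```python
from typing import Sequence, Tuple

RGB = Tuple[float, float, float]


def ratio_term(a: float, b: float, eps: float = 0.1) -> float:
    """One summand of Eq. 22.5: |a - b| / (eps + a + b), with eps = 0.1."""
    return abs(a - b) / (eps + a + b)


def accc_distance(
    alpha: Sequence[float],
    beta: Sequence[RGB],
    alpha_prime: Sequence[float],
    beta_prime: Sequence[RGB],
    lam1: float = 0.5,
    lam2: float = 0.5,  # assumption: the text only states the first weight
) -> float:
    """Weighted distance between the ACCC descriptors of two images."""
    # Autocorrelogram part: one scalar per color bin.
    ac_part = sum(ratio_term(a, b) for a, b in zip(alpha, alpha_prime))
    # Auto color correlation part: one RGB mean per color bin.
    # Assumption: the three channels are compared independently and summed.
    acc_part = sum(
        ratio_term(x, y)
        for rgb, rgb_prime in zip(beta, beta_prime)
        for x, y in zip(rgb, rgb_prime)
    )
    return lam1 * ac_part + lam2 * acc_part
```

Identical descriptors give a distance of zero, and the constant 0.1 in the denominator keeps each term finite when a bin is empty in both images.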

$\lambda_1$ and $\lambda_2$ are the similarity weighting constants of the autocorrelogram and the auto color correlation, respectively. In the experiments conducted, $\lambda_1 = 0.5$, and $\alpha_i$ and $\beta_i$ are defined by Eq. 22.6. The details of the ACC and ACCC algorithms are presented in Tungkastsathan and Premchaisawadi [10].

\[
\begin{aligned}
\alpha_i &= \gamma^{(k)}_{c_i}(I) \\
\beta_i &= \left\{ rmc_j\,\gamma^{(k)}_{c_i c_j}(I),\; gmc_j\,\gamma^{(k)}_{c_i c_j}(I),\; bmc_j\,\gamma^{(k)}_{c_i c_j}(I) \;\middle|\; c_i \neq c_j \right\}
\end{aligned}
\tag{22.6}
\]


22.4 Experiment and Evaluation

The experiments were divided into two groups. In group 1, we evaluated the retrieval rate for on-line Yahoo image data sets in terms of user relevance. In group 2, we studied the performance of multi-thread processing in terms of data parallelism for real-time image retrieval tasks.

22.4.1 Evaluation of the Retrieval Rate

We have implemented an on-line image retrieval system using the Yahoo image database based on the Yahoo BOSS API. The application was developed using Microsoft .NET and implemented in the Windows NT environment. The goal of this experiment is to show that relevant images can be found after a small number of iterations; only the first round is used in this experiment. From the viewpoint of user interface design, precision and recall measures are less appropriate for assessing an interactive system [14]. To evaluate the performance of the system in terms of user feedback, user-oriented measures are used. Other such measures have been proposed, including relative recall, recall effort, coverage ratio, and novelty ratio [15]. In this experiment the coverage ratio measure is used.

Let $R$ be the set of relevant images of query $q$ and $A$ be the answer set retrieved. Let $U \subseteq R$ be the set of relevant images which are known to the user, and $|U|$ the number of images in it. Let $R_k = A \cap U$ be the intersection of the sets $A$ and $U$, and $|R_k|$ the number of images in this set. The coverage ratio is defined by Eq. 22.7.

\[
\mathrm{Coverage}(C_q) = \frac{|R_k|}{|U|}
\tag{22.7}
\]

Let $N(q)$ be the number of keywords used. The average coverage ratio is given by Eq. 22.8.

\[
C(q) = \frac{1}{N(q)} \sum_{i=1}^{N(q)} \frac{|R_k|}{|U|}
\tag{22.8}
\]
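The two coverage measures can be illustrated with a short Python sketch. It only restates Eqs. 22.7 and 22.8 with sets; the function names and the representation of images as hashable identifiers are assumptions made for the example.

```python
from typing import Hashable, Iterable, Sequence, Tuple


def coverage_ratio(answer: Iterable[Hashable], known_relevant: Iterable[Hashable]) -> float:
    """Eq. 22.7: share of the relevant images known to the user that were retrieved."""
    known = set(known_relevant)
    if not known:
        raise ValueError("the set of known relevant images must not be empty")
    return len(set(answer) & known) / len(known)


def average_coverage(
    queries: Sequence[Tuple[Iterable[Hashable], Iterable[Hashable]]]
) -> float:
    """Eq. 22.8: mean coverage ratio over all keyword queries."""
    ratios = [coverage_ratio(answer, known) for answer, known in queries]
    return sum(ratios) / len(ratios)
```

For instance, if the user knows four relevant images and two of them appear in the answer set, the coverage ratio for that query is 0.5.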

To conduct this experiment, Yahoo Images is first queried to obtain a large number of images returned by a given text-based query. The user then selects one relevant image, in a single round of interaction. Those images that are most similar to the new query image are returned. The retrieval performance in terms of coverage ratio of the proposed system is compared to the traditional Yahoo text-based search results. The average coverage ratio is generated based on the ACC and ACCC algorithms across 49 random test keywords in heterogeneous categories (i.e. animal, fruit, sunset, nature, and landscape). The results are presented in Table 22.1.

Table 22.1 Coverage ratio average of the top 24 of 200 retrieved images

Sample images     Coverage ratio
                  Sample 1   Sample 2   Avg.   Text-based
Animal            0.71       0.65       0.68   0.42
Fruit             0.79       0.71       0.75   0.32
Sunset/sunrise    0.62       0.65       0.63   0.58
Nature            0.64       0.59       0.62   0.36
Landscape         0.69       0.65       0.67   0.43

The data in Table 22.1 show that combining a keyword with a user’s feedback through the ACCC algorithm can increase the efficiency of image retrieval from the Yahoo image database. Using the combination of text and a user’s feedback for an image search, the images that do not correspond with the category are filtered out. It also decreases the chance that images in other categories are retrieved. In the experiment, we used two sample images obtained from the keyword search as query images for evaluating the performance of the system. The screenshots of the online image search application are shown in Figs. 22.5 and 22.6, respectively.

Fig. 22.5 Query results using a keyword search

Fig. 22.6 Query results after applying relevance feedback

22.4.2 Performance of Multithreading in the Image Retrieval Tasks

In the experimental settings, we used one keyword for downloading two hundred images and performed the image search in the same environment (internet speed, time for testing, hardware and software platforms). We tested the application by using 49 keywords in heterogeneous categories (i.e. animal, fruit, sunset, nature, and landscape). We ran the image search three times for each keyword and calculated the average processing time of the whole process for an on-line image retrieval task. The number of downloaded images for each keyword had a maximum error of less than ten percent of the total downloaded images.

The threads were tested and run on two different hardware platforms, with single-core and multi-core CPUs. The hardware specifications are as follows: (1) Pentium IV single-core 1.8 GHz with 1 GB DDR2 RAM; (2) Quad-Core Intel Xeon processor E5310 1.60 GHz, 1066 MHz FSB, with 1 GB (2 × 512 MB) PC2-5300 DDR2 RAM. The number of threads versus time on single-core and multi-core CPUs for an image retrieval process that includes image downloading, feature extraction and image comparison is shown in our previous work [16]. We can conclude that the processing time for the same number of threads differs between the platforms (see Figs. 22.7 and 22.8). We therefore selected the most suitable number of threads from the tests on each platform as the basis for a hypothesis test. The results are shown in Table 22.2.
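The two-stage threading scheme described above (parallel downloads followed by parallel feature extraction and comparison) can be sketched as follows. The original application was written with Microsoft .NET; this Python sketch is only an illustration, and the function name, the callback parameters `fetch`, `extract` and `distance`, and the default thread counts are assumptions rather than the authors' code.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, List, Sequence


def retrieve_similar(
    urls: Sequence[str],
    fetch: Callable[[str], Any],            # hypothetical: downloads one image
    extract: Callable[[Any], Any],          # hypothetical: computes its features
    distance: Callable[[Any, Any], float],  # hypothetical: compares two feature sets
    query_features: Any,
    download_threads: int = 10,
    compare_threads: int = 3,
) -> List[str]:
    """Return the URLs ordered from most to least similar to the query."""
    # Stage 1: download all images with a pool of worker threads.
    with ThreadPoolExecutor(max_workers=download_threads) as pool:
        images = list(pool.map(fetch, urls))

    def score(item):
        url, image = item
        return distance(query_features, extract(image)), url

    # Stage 2: feature extraction and comparison, again in a thread pool.
    with ThreadPoolExecutor(max_workers=compare_threads) as pool:
        scored = list(pool.map(score, zip(urls, images)))

    # Sorting module: smallest distance first.
    return [url for _, url in sorted(scored)]
```

In CPython, threads mainly help the I/O-bound download stage; a CPU-bound feature extraction stage would need processes or native code to use several cores, which is one way this sketch differs from a .NET implementation.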

Fig. 22.7 Number of threads versus time on multi-core in all processes for online CBIR system [16]
[Figure: line chart of Time (Sec.) against Number of Queries (1 to 49); series: Serialized, 5 threads, 10 threads, 25 threads.]

Fig. 22.8 Number of threads versus time on multi-core in all processes for online CBIR system [16]
[Figure: line chart of Time (Sec.) against Number of Queries (1 to 49); series: Serialized, 5 threads, 10 threads, 25 threads, 50 threads.]

Table 22.2 The average time in seconds (mean ± stddev) of the whole process (image downloading, feature extraction, and image comparison) at the most suitable number of threads on each platform

W(q)  S-core 10 threads  Q-core 50 threads  W(q)  S-core 10 threads  Q-core 50 threads
1     159.6 ± 3.3        57.0 ± 5.7         26    153.6 ± 9.2        61.0 ± 5.1
2     165.6 ± 13.5       57.3 ± 6.9         27    156.6 ± 4.9        64.3 ± 5.8
3     165.7 ± 15.1       65.7 ± 3.9         28    157.6 ± 13.1       68.3 ± 3.9
4     148.7 ± 18.9       70.0 ± 6.2         29    159.0 ± 11.4       66.0 ± 7.8
5     155.6 ± 4.9        75.3 ± 7.1         30    161.3 ± 11.8       56.7 ± 1.7
6     150.0 ± 13.1       53.3 ± 1.7         31    160.3 ± 7.6        51.3 ± 4.5
7     159.3 ± 8.5        60.7 ± 1.2         32    165.6 ± 16.9       63.3 ± 4.2
8     158.0 ± 21.4       65.7 ± 2.5         33    153.0 ± 7.5        69.3 ± 3.9
9     148.3 ± 17.6       61.3 ± 5.4         34    153.3 ± 12.6       59.0 ± 4.5
10    160.0 ± 21.9       67.0 ± 3.6         35    150.0 ± 10.7       63.6 ± 6.0
11    158.3 ± 5.7        71.0 ± 2.2         36    158.0 ± 11.3       63.7 ± 2.1
12    162.7 ± 10.9       66.3 ± 4.0         37    159.6 ± 11.4       60.7 ± 6.8
13    155.3 ± 3.4        63.7 ± 1.2         38    154.0 ± 5.9        61.0 ± 2.9
14    157.7 ± 3.1        62.0 ± 1.2         39    157.0 ± 13.4       65.7 ± 2.5
15    149.3 ± 4.5        58.7 ± 7.5         40    156.6 ± 8.9        62.3 ± 4.2
16    152.0 ± 2.3        61.3 ± 2.5         41    165.0 ± 11.1       53.7 ± 4.8
17    167.7 ± 18.6       66.0 ± 7.8         42    164.3 ± 9.7        56.7 ± 2.6
18    174.7 ± 15.1       66.0 ± 4.9         43    149.3 ± 11.1       56.3 ± 3.1
19    171.3 ± 7.6        58.0 ± 4.1         44    138.3 ± 6.2        64.3 ± 7.8
20    162.0 ± 10.0       71.0 ± 9.2         45    148.6 ± 4.0        58.0 ± 0.8
21    163.7 ± 3.7        59.3 ± 4.5         46    150.0 ± 9.9        57.7 ± 6.8
22    162.3 ± 8.3        58.0 ± 7.1         47    146.0 ± 7.5        57.0 ± 4.3
23    162.0 ± 4.2        56.0 ± 5.1         48    145.0 ± 8.5        60.7 ± 5.7
24    156.3 ± 16.0       72.3 ± 10.2        49    150.0 ± 10.4       54.0 ± 3.3
25    154.7 ± 8.9        64.3 ± 11.6        Avg   157.1 ± 8.4        62.0 ± 7.3

We tested this hypothesis using a statistical t-test. We did a t-test on the 49 keywords for retrieving images in order to measure the significance of the complete processing time obtained after applying our proposed scheme (see Table 22.2). The mean processing times of the single-core and multi-core platforms are 157.12 ± 8.4 and 62.0 ± 7.3 s, respectively. Using the t-test to compare the means of the two independent CPU platform specifications, the P value obtained for single-core versus multi-core is 1.98e-25. The test shows that the multi-core platform consumes significantly less processing time than the single-core platform.
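A comparison of two independent samples of processing times can be sketched with the Python standard library. The text does not say which variant of the t-test was used, so the sketch below assumes Welch's unequal-variance form and only computes the test statistic and its degrees of freedom; turning these into a P value needs a Student t distribution, which the standard library does not provide. The function name is hypothetical.

```python
import math
import statistics
from typing import Sequence, Tuple


def welch_t(sample_a: Sequence[float], sample_b: Sequence[float]) -> Tuple[float, float]:
    """Welch's t statistic and approximate degrees of freedom for two samples."""
    n_a, n_b = len(sample_a), len(sample_b)
    mean_a, mean_b = statistics.mean(sample_a), statistics.mean(sample_b)
    # Sample variances (n - 1 in the denominator), scaled by the sample sizes.
    var_a = statistics.variance(sample_a) / n_a
    var_b = statistics.variance(sample_b) / n_b
    t = (mean_a - mean_b) / math.sqrt(var_a + var_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (var_a + var_b) ** 2 / (var_a ** 2 / (n_a - 1) + var_b ** 2 / (n_b - 1))
    return t, df
```

Applied to the per-keyword mean times of the two platforms, a large positive statistic indicates that the first platform is slower than the second.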

22.5 Conclusions

This research presents an interactive approach to filter out the junk images from the keyword-based Yahoo image search results. The advanced spatial color descriptors, namely auto color correlation (ACC) and auto color correlogram and correlation (ACCC), are proposed. To reduce the processing time of feature computation, a multi-threaded processing method is also proposed. The coverage ratio measure is used to evaluate the retrieval performance of the user’s relevance feedback. Experiments on diverse keyword-based queries from the Yahoo Images search engine obtained very positive results. Additionally, the experimental results show that our proposed scheme can speed up feature extraction and image similarity measurement as well as image downloading from various hosts. The use of multiple threads can significantly improve the performance of image indexing and retrieval on both platforms. In future work based on this study, distributed processing and multithreading will be considered in combination to achieve parallelism.

References

1. Yuli G, Jinye P, Hangzai L, Keim DA, Jianping F (2009) An interactive approach for filtering out junk images from keyword based Google search results. IEEE Trans Circuits Syst Video Technol 19(12):1–15
2. Lu Y, Gao P, Lv R, Su Z, Yu W (2007) Study of content-based image retrieval using parallel computing technique. In: Proceedings of the 2007 Asian technology information program’s (ATIP’s), 11 November–16 November 2007, China, pp 186–191
3. Kao O, Steinert G, Drews F (2001) Scheduling aspects for image retrieval in cluster-based image databases. In: Proceedings of first IEEE/ACM. Cluster computing and the grid, 15 May–18 May 2001, Brisbane, Australia, pp 329–336
4. Ling Y, Ouyang Y (2008) Image semantic information retrieval based on parallel computing. In: Proceeding of international colloquium on computing, communication, control, and management, CCCM, 3 August–4 August 2008, vol 1, pp 255–259
5. Kao O (2001) Parallel and distributed methods for image retrieval with dynamic feature extraction on cluster architectures. In: Proceedings of 12th international workshop on database and expert systems applications, Munich, Germany, 3 September 2001–7 September 2001, pp 110–114
6. Pengdong G, Yongquan L, Chu Q, Nan L, Wenhua Y, Rui L (2008) Performance comparison between color and spatial segmentation for image retrieval and its parallel system implementation. In: Proceedings of the international symposium on computer science and computational technology, ISCSCT 2008, 20 December–22 December 2008, Shanghai, China, pp 539–543
7. Town C, Harrison K (2010) Large-scale grid computing for content-based image retrieval. Aslib Proc 62(4/5):438–446
8. Multi-threading in IDL. http://www.ittvis.com/
9. Gao Y, Fan J, Luo H, Satoh S (2008) A novel approach for filtering junk images from Google search results. In: Lecture notes in computer science: advances in multimedia modeling, vol 4903, pp 1–12
10. Tungkastsathan A, Premchaisawadi W (2009) Spatial color indexing using ACC algorithms. In: Proceeding of the international conference on ICT and knowledge engineering, 1 December–2 December 2009, Bangkok, Thailand, pp 113–117
11. Huang J, Kumar SR, Mitra M, Zhu W-J (1998) Spatial color indexing and applications. In: Proceeding of sixth international conference on computer vision, 4 January–7 January 1998, Bombay, India, pp 606–607
12. Yahoo BOSS API. http://developer.yahoo.com/search/boss/
13. Lee HY, Lee HK, Ha HY (2003) Spatial color descriptor for image retrieval and video segmentation. IEEE Trans Multimed 5(3):358–367
14. Ricardo B-Y, Berthier R-N (1999) Modern information retrieval. ACM Press Book, New York
15. Robert RK (1993) Information storage and retrieval. Wiley, New York
16. Premchaisawadi W, Tungkatsathan A (2010) Micro level attacks in real-time image processing for an on-line CBIR system. In: Lecture notes in engineering and computer science: proceedings of the world congress on engineering 2010, WCE 2010, 30 June–2 July 2010, London, UK, pp 182–186

Chapter 23
Topological Mapping Using Vision and a Sparse Distributed Memory

Mateus Mendes, A. Paulo Coimbra and Manuel M. Crisóstomo

Abstract Navigation based on visual memories is very common among humans. However, planning long trips requires a more sophisticated representation of the environment, such as a topological map, where connections between paths are easily noted. The present approach is a system that learns paths by storing sequences of images and image information in a sparse distributed memory (SDM). Connections between paths are detected by exploring similarities in the images, using the same SDM, and a topological representation of the paths is created. The robot is then able to plan paths and switch from one path to another at the connection points. The system was tested under reconstitutions of country and urban environments, and it was able to successfully map, plan paths and navigate autonomously.

M. Mendes (corresponding author): ESTGOH, Polytechnic Institute of Coimbra, R. General Santos Costa, 3400-124 Oliveira do Hospital, Portugal
M. Mendes, A. P. Coimbra, M. M. Crisóstomo: Institute of Systems and Robotics, Pólo II, University of Coimbra, 3000 Coimbra, Portugal

S. I. Ao and L. Gelman (eds.), Electrical Engineering and Applied Computing, Lecture Notes in Electrical Engineering, 90, DOI: 10.1007/978-94-007-1192-1_23, © Springer Science+Business Media B.V. 2011

23.1 Introduction

About 80% of all the information humans rely on is visual [4], and the brain operates mostly with sequences of images [5]. View sequence based navigation is also extremely attractive for autonomous robots, for the hardware is very straightforward, and the approach is biologically plausible. However, while humans are able to navigate quite well based only on visual information, images usually require huge computer processing power. This means that for real time robot operation, visual information is often avoided. Other sensors, such as sonar or laser range finders, provide accurate information at a much lower computational cost.

The goal of equipping robots with cameras and vision-based navigation is still an open research issue. The use of special landmarks (possibly artificial, such as barcodes or data matrices) is a trick that can greatly improve the accuracy of the system [13]. As for the images, there are two popular approaches: one that uses plain images [6], the other that uses panoramic images [8]. Panoramic images offer a 360° view, which is richer than a plain front or rear view. However, that richness comes at the cost of even greater processing power requirements. Besides, the process of acquiring panoramic images requires the use of parabolic mirrors, which also introduce some distortion in the images.

Some authors have also proposed techniques to speed up processing and/or reduce memory needs. Matsumoto [7] uses images as small as 32 × 32 pixels. Ishiguro [2] replaced the images by their Fourier transforms. Winters [15] compresses the images using Principal Component Analysis. All those techniques improve the processing time and/or efficiency of image processing in real time, contributing to make robot navigation based on image processing more plausible.

The images alone are a means for instantaneous localisation. View-based navigation is almost always based on the same idea: during a learning stage the robot learns a sequence of views and motor commands that, if followed with minimum drift, will lead it to a target location.
By following the sequence of commands, possibly correcting the small drifts that may occur, the robot is later able to follow the learnt path on its own. The idea is very simple and it works very well for single paths. However, it is not versatile and requires that all the paths are taught one by one. For complex trips and environments, that may be very time consuming. The process can be greatly simplified using topological maps and path planning algorithms. To plan paths efficiently, switching from one path to another at connection nodes, when necessary, more sophisticated representations of the environment are required than just plain images of sampling points. Those representations are provided by metric or topological maps [12]. Those maps represent paths and connections between them. They are suitable to use with search algorithms such as A*, for implementing intelligent planning and robot navigation. This paper explains how vision-based navigation is achieved using a sparse distributed memory (SDM) to store sequences of images. The memory is also used to recognise overlaps of the paths and thus establish connection nodes where the robot can switch from one path to another. That way, a topological representation of the world can be constructed, and the system can plan paths. Part of this work has already been published in [9]. Section 23.2 explains navigation based on view sequences in more detail. Section 23.3 explains how the SDM works. In Sect. 23.4 the robot platform used for the experiments is described. Section 23.5 describes the navigation algorithm, and Sect. 23.6 shows and discusses the results obtained.


23.2 Navigation Using View Sequences

Usually, the vision-based approaches for robot navigation are based on the concept of a “view-sequence” and a look-up table of motor commands, where each view is associated with a corresponding motor command that leads the robot towards the next view in the sequence. In the present work, the approach followed is very similar to that of Matsumoto et al. [7]. That approach requires a learning stage, during which the robot must be manually guided. While being guided, the robot memorises a sequence of views automatically. While autonomously running, the robot performs automatic image based localisation and obstacle detection, taking action in real-time. Localisation is estimated based on the similarity of two views: one stored during the learning stage and another grabbed in real-time. The robot tries to find matching areas between those two images, and calculates the horizontal distance between them in order to infer how far it is from the correct path. That distance is then used to correct possible drifts to the left or to the right. The technique is described in more detail in [10].

23.3 Sparse Distributed Memories

The sparse distributed memory is an associative memory model proposed by Kanerva in the 1980s [5]. It is suitable to work with high dimensional binary vectors. In the present work, an image can be regarded as a high-dimensional vector, and the SDM can be used simultaneously as a sophisticated storage and retrieval mechanism and a pattern-matching tool.

23.3.1 The Original Model

The underlying idea behind the SDM is the mapping of a huge binary memory onto a smaller set of physical locations, called hard locations. As a general guideline, those hard locations should be uniformly distributed in the virtual space, to mimic the existence of the larger virtual space as accurately as possible. Every datum is stored by distribution to a set of hard locations, and retrieved by averaging those locations and comparing the result to a given threshold.

Figure 23.1 shows a model of a SDM. “Address” is the reference address where the datum is to be stored or read from. It will activate all the hard locations within a given access radius, which is predefined. Kanerva proposes that the Hamming distance, that is, the number of bits in which two binary vectors differ, be used as the measure of distance between the addresses. All the locations that differ by no more than a predefined number of bits from the input address are selected for the read or write operation. In the figure, the first and the third locations are selected. They are, respectively, 2 and 3 bits away from the input address, and the activation radius is exactly 3 bits.

Data are stored in arrays of counters, one counter for every bit of every location. Writing is done by incrementing or decrementing the bit counters at the selected addresses. To store 0 at a given position, the corresponding counter is decremented. To store 1, it is incremented. Reading is done by summing the values of all the counters columnwise and thresholding the result at a predefined value. If the value of the sum is below the threshold, the bit is zero, otherwise it is one. Initially, all the bit counters must be set to zero, for the memory stores no data. The bits of the address locations should be set randomly, so that the addresses are uniformly distributed in the addressing space.

There is no guarantee that the data retrieved is exactly the same that was written. It should be, provided that the hard locations are correctly distributed over the binary space and the memory has not reached saturation.

Fig. 23.1 One model of a SDM, using bit counters
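The counter-based model described above can be sketched in Python as follows. This is a minimal illustration rather than Kanerva's or the authors' code: the class and method names are hypothetical, the data word is assumed to have the same width as the address, and the read threshold is assumed to be zero.

```python
import random
from typing import List, Sequence


def hamming(a: Sequence[int], b: Sequence[int]) -> int:
    """Number of positions in which two bit vectors differ."""
    return sum(x != y for x, y in zip(a, b))


class CounterSDM:
    """Sparse distributed memory with one counter per bit of every hard location."""

    def __init__(self, n_locations: int, n_bits: int, radius: int, seed: int = 0):
        rng = random.Random(seed)
        self.n_bits = n_bits
        self.radius = radius
        # Hard-location addresses are drawn at random over the binary space.
        self.addresses = [
            [rng.randint(0, 1) for _ in range(n_bits)] for _ in range(n_locations)
        ]
        # All counters start at zero: the memory stores no data.
        self.counters = [[0] * n_bits for _ in range(n_locations)]

    def _active(self, address: Sequence[int]) -> List[int]:
        # Select every hard location within the access radius of the address.
        return [
            i for i, hard in enumerate(self.addresses)
            if hamming(hard, address) <= self.radius
        ]

    def write(self, address: Sequence[int], data: Sequence[int]) -> None:
        for i in self._active(address):
            row = self.counters[i]
            for k, bit in enumerate(data):
                row[k] += 1 if bit else -1  # increment for 1, decrement for 0

    def read(self, address: Sequence[int], threshold: int = 0) -> List[int]:
        active = self._active(address)
        # Sum the counters column-wise over the selected locations.
        sums = [sum(self.counters[i][k] for i in active) for k in range(self.n_bits)]
        return [0 if s < threshold else 1 for s in sums]
```

With a single stored pattern, reading at the same address returns the pattern as long as at least one hard location falls inside the access radius; as more patterns are written, the counters of shared locations mix and retrieval becomes approximate.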

23.3.2 The Model Used

The original SDM model, though theoretically sound and attractive, has some faults. One problem is that of selecting the hard locations at random in the beginning of the operation. Another problem is that of using bit counters, which cause a very low storage rate of about 0.1 bits per bit of traditional computer memory and slow down the system. Those problems have been thoroughly described in [11], where the authors study alternative architectures and methods of encoding the data.

To overcome the problem of placing hard locations in the address space, in the present work the hard locations are selected using the Randomised Reallocation algorithm proposed by Ratitch and Precup [14]. The idea is that the system starts with an empty memory and allocates new hard locations when there is a new datum which cannot be stored in enough existing locations. The new locations are placed randomly in the neighbourhood of the new datum address.

Fig. 23.2 Alternative architecture of the SDM, auto-associative and using integer numbers

Fig. 23.3 Robot used

To overcome the problem of using bit counters, the bits are grouped as integers, as shown in Fig. 23.2. Addressing is done using an arithmetic distance, instead of the Hamming distance. Learning is achieved through the use of a gradient descent approach, updating each byte value using the equation:

\[
h_t^k = h_{t-1}^k + \alpha \left( x^k - h_{t-1}^k \right), \qquad \alpha \in \mathbb{R} \wedge 0 \le \alpha \le 1
\tag{23.1}
\]

The value $h_t^k$ is the $k$th integer number in the hard location $h$, at time $t$. The value $x^k$ is the corresponding $k$th integer number in the input vector $x$. The coefficient $\alpha$ is the learning rate; in this case it was set to 1, enforcing one shot learning.
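The update rule of Eq. 23.1 can be illustrated with a short Python sketch. The function names are hypothetical, and the definition of the arithmetic distance as a sum of absolute differences is an assumption, since the text does not spell it out.

```python
from typing import List, Sequence


def arithmetic_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Assumed form of the arithmetic distance: sum of absolute differences."""
    return sum(abs(p - q) for p, q in zip(a, b))


def update_location(hard: Sequence[float], x: Sequence[float], alpha: float = 1.0) -> List[float]:
    """Eq. 23.1: move every value of a hard location towards the input vector."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return [h + alpha * (xk - h) for h, xk in zip(hard, x)]
```

With a learning rate of 1 the hard location simply takes the value of the input (one shot learning), while smaller values blend the old content with the new datum.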

23.4 Experimental Platform

The robot used was a Surveyor SRV-1, a small robot with tank-style treads and differential drive via two precision DC gearmotors (Fig. 23.3). Among other features, it has a built in digital video camera and a 802.15.4 radio communication module. This robot was controlled in real time from a laptop with a 1.8 GHz processor and 1 GB RAM. The overall software architecture is as shown in Fig. 23.4. It contains three basic modules:

1. The SDM, where the information is stored.
2. The Focus (following Kanerva’s terminology), where the navigation algorithms are run.
3. An operational layer, responsible for interfacing the hardware and some tasks such as motor control, collision avoidance and image equalisation.

Fig. 23.4 Architecture of the implemented software

Navigation is based on vision, and has two modes: supervised learning, in which the robot is manually guided and captures images to store for future reference; and autonomous running, in which it uses previous knowledge to navigate autonomously, following any sequence previously learnt. The vectors stored in the SDM consist of arrays of bytes, as summarised in Eq. 23.2:

\[
x_i = \langle im_i,\ seq\_id,\ i,\ timestamp,\ motion \rangle
\tag{23.2}
\]

In the vector, $im_i$ is the image $i$, in PGM (Portable Gray Map) format and 80 × 64 resolution. In PGM images, every pixel is represented by an 8-bit integer. The value 0 corresponds to a black pixel, the value 255 represents a white pixel. seq_id is an auto-incremented, 4-byte integer, unique for each sequence. It is used to identify which sequence the vector belongs to. The number $i$ is an auto-incremented, 4-byte integer, unique for every vector in the sequence, used to quickly identify every image in the sequence. The timestamp is a 4-byte integer, storing a Unix timestamp. It is not being used so far for navigation purposes. The field motion is a single character, identifying the type of movement the robot performed after the image was grabbed. The image alone uses 5,120 bytes. The overhead information comprises 13 additional bytes. Hence, the input vector contains a total of 5,133 bytes.
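The byte layout of Eq. 23.2 can be checked with a small Python sketch based on the standard `struct` module. The field order follows the equation, but the little-endian byte order, the unsigned integer types and the function names are assumptions made for the example.

```python
import struct
from typing import Tuple

IMAGE_BYTES = 80 * 64  # one byte per pixel of the 80 x 64 grey-level image

# Assumed layout of the overhead fields: seq_id, i and timestamp as 4-byte
# unsigned integers plus one motion character, packed without padding.
HEADER = struct.Struct("<IIIc")


def pack_vector(image: bytes, seq_id: int, index: int, timestamp: int, motion: bytes) -> bytes:
    """Build the 5,133-byte input vector <im_i, seq_id, i, timestamp, motion>."""
    if len(image) != IMAGE_BYTES:
        raise ValueError("image must contain exactly 5,120 pixel bytes")
    return bytes(image) + HEADER.pack(seq_id, index, timestamp, motion)


def unpack_vector(vector: bytes) -> Tuple[bytes, int, int, int, bytes]:
    """Split an input vector back into its five fields."""
    seq_id, index, timestamp, motion = HEADER.unpack(vector[IMAGE_BYTES:])
    return vector[:IMAGE_BYTES], seq_id, index, timestamp, motion
```

The header structure has a size of 13 bytes, which together with the 5,120 image bytes gives the 5,133 bytes stated in the text.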

23.5 Mapping and Planning

The “teach and follow” approach per se is very simple and powerful. But for robust navigation and route planning, it is necessary to extend the basic algorithm to perform additional tasks. For example, it is necessary to detect connection points between the paths learnt, when two or more paths cross, come together or split apart. It is also necessary to disambiguate when there are similar images or divergent paths.

Fig. 23.5 Example of paths that have a common segment. The robot only needs to learn AB once

23.5.1 Filtering Out Unnecessary Images

During learning in vision-based navigation, not every single picture needs to be stored. There are scenarios, such as corridors, in which the views are very similar for a long period of time. Those images do not provide data useful for navigation. Therefore, they can be filtered out during the learning stage, so that only images which are sufficiently different from their predecessors are stored. That behaviour can be easily implemented using the SDM: every new image is only stored if there is no image within a predefined radius in the SDM. If the error in similarity between the new image and any image in the SDM is below a given threshold, the new image is discarded. A good threshold to use for that purpose is the memory activation radius. Because of the way the SDM works, new images that are less than an activation radius from an already stored image will be stored in the same hard locations. Therefore, they are most probably unnecessary, and can be discarded with no risk of impairing the performance of the system.
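The filtering rule can be sketched in Python as follows. In the real system the SDM itself answers the question of whether a similar image is already stored; this sketch replaces that lookup with a linear scan over a list, and the function name and the `distance` callback are assumptions.

```python
from typing import Callable, Iterable, List, TypeVar

View = TypeVar("View")


def filter_views(
    views: Iterable[View],
    radius: float,
    distance: Callable[[View, View], float],
) -> List[View]:
    """Keep a view only if no stored view lies within the given radius."""
    stored: List[View] = []
    for view in views:
        # Linear scan standing in for the SDM lookup (assumption).
        if all(distance(view, kept) > radius for kept in stored):
            stored.append(view)
    return stored
```

Using the memory activation radius as the threshold means that a discarded view would have activated the same hard locations as a view that is already stored.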

23.5.2 Detecting Connection Points

Another situation in which new images do not provide useful information is the case when two paths have a common segment, such as depicted in Fig. 23.5. The figure shows two different paths, 1 and 2, in which the segment AB is common. If the robot learns segment AB for path 1, for example, then it does not need to learn it again for path 2. When learning path number 2, it only needs to learn it until point A. Then it can store an association between paths 1 and 2 at point A and skip all the images until point B. At point B, it should again record a connection between paths 1 and 2. That way, it builds a map of the connection points between the known paths. That is a kind of topological representation of the environment.

The main problem with this approach is to detect the connection points. The points where the paths come together (point A in Fig. 23.5) can be detected after a reasonable number of images of path 1 have been retrieved, when the robot is learning path 2. When that happens, the robot stores the connection in its working memory and stops learning path 2. From that point onwards, it keeps monitoring if it is following the same path that it has learnt. After a reasonable number of predictions have failed, it adds another connection point to the graph and resumes learning the new path. In the tests with the SDM, 3–5 consecutive images within the access radius usually sufficed to establish a connection point, and 3–5 images out of the access radius were a good indicator that the paths were diverging again.
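The detection logic can be pictured as a small state machine driven by one boolean per new image: whether the image falls within the access radius of an image of an already known path. The Python sketch below is an illustration under that assumption; the function name, the event labels and the default thresholds of 3 are hypothetical choices within the 3–5 range mentioned in the text.

```python
from typing import Iterable, List, Tuple


def find_connections(
    matches: Iterable[bool],
    min_overlap: int = 3,
    min_diverge: int = 3,
) -> List[Tuple[str, int]]:
    """Return ("join", index) and ("split", index) events for a new path.

    matches[i] is True when image i of the new path is within the access
    radius of an image of a known path.
    """
    events: List[Tuple[str, int]] = []
    merged = False  # True while the new path follows a known path
    run = 0         # consecutive images contradicting the current state
    for i, hit in enumerate(matches):
        if hit != merged:
            run += 1
            needed = min_diverge if merged else min_overlap
            if run >= needed:
                merged = not merged
                # The connection point is placed at the start of the run.
                events.append(("join" if merged else "split", i - run + 1))
                run = 0
        else:
            run = 0
    return events
```

A single stray match or miss resets the counter, so isolated recognition errors do not create spurious connection points.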

23.5.3 Sequence Disambiguation

One problem that arises when using navigation based on sequences is that of sequence disambiguation. Under normal circumstances, sequences such as (1) ABC; (2) XBZ; or (3) DEFEG may occur, each capital letter representing a random input vector. There are two different problems with these three sequences: (1) and (2) both share one common element (B); and one element (E) occurs in two different positions of sequence (3). In the first case, the successor of B can be either C or Z. In the second case, the successor of E can be either F or G. The correct prediction depends on the history of the system.

One possible solution relies on using a kind of short term memory. Kanerva proposes a solution in which the input to the SDM is not the last input $D_t$, but the juxtaposition of the last $k$ inputs $\{D_t, D_{t-1}, \ldots, D_{t-k}\}$. This technique is called folding, and $k$ is the number of folds. The disadvantage is that it greatly increases the dimensionality of the input vector. Bose [1] uses an additional neural network to store a measure of the context, instead of adding folds to the memory.

In the present work, a solution inspired by Jaeckel and Karlsson’s proposal of segmenting the addressing space [3] seemed more appropriate. Jaeckel and Karlsson propose to fix a certain number of coordinates when addressing, thus reducing the number of hard locations that can be selected. In the present work, the goal is to retrieve an image just within the sequence that is being followed. Hence, Jaeckel’s idea is appropriate for that purpose. The number of the sequence can be fixed, thus truncating the addressing space.
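The effect of fixing the sequence number can be illustrated with a Python sketch that restricts the search to the views of one sequence before predicting the successor. It covers only the case of an element shared between two sequences (ABC and XBZ); the data structure, the function name and the list-based memory are assumptions standing in for the SDM.

```python
from typing import Any, Callable, List, NamedTuple, Optional


class StoredView(NamedTuple):
    seq_id: int  # sequence the view belongs to
    index: int   # position of the view within its sequence
    image: Any   # stand-in for the image bytes


def predict_next(
    memory: List[StoredView],
    image: Any,
    seq_id: int,
    distance: Callable[[Any, Any], float],
    radius: float,
) -> Optional[StoredView]:
    """Find the best match inside one sequence and return its successor."""
    # Fixing seq_id truncates the search space to a single sequence.
    in_sequence = [v for v in memory if v.seq_id == seq_id]
    near = [v for v in in_sequence if distance(image, v.image) <= radius]
    if not near:
        return None
    best = min(near, key=lambda v: distance(image, v.image))
    for v in in_sequence:
        if v.index == best.index + 1:
            return v
    return None
```

With sequences ABC and XBZ stored, querying B while following sequence 1 yields C, and querying B while following sequence 2 yields Z.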

23.6 Experiments and Results

Due to practical constraints, the experiments were performed in a small testbed in the laboratory. The testbed consisted of an arena surrounded by a realistic countryside scenario, or filled in with objects simulating an urban environment.

23.6.1 Tests in an Arena Simulating a Country Environment

The first experiment performed consisted in analysing the behaviour of the navigation algorithm in the arena. The surrounding wall was printed with a composition of images of mountain views, as shown in Fig. 23.8. The field of view of the camera is relatively narrow (about 40°), so the robot cannot capture above or beyond the wall. Sometimes it can capture parts of the floor.

Figure 23.6 shows an example of the results obtained. In the example, the robot was first taught paths L1 and L2. Then the memory was loaded with both sequences, establishing connection points A and B. The minimum number of overlapping images required for establishing a connection point was set to 3 consecutive images. The minimum number of different images necessary for splitting the paths at point B was also set to 3 consecutive images out of the access radius. The lines in Fig. 23.6 were drawn by a pen attached to the rear of the robot. Therefore, they represent the motion of the rear, not the centre of the robot, causing the arcs that appear when the robot changes direction.

As the picture shows, the robot was able to start at the beginning of sequence L1 and finish at the end of sequence L2, and vice versa. Regardless of its starting point, at point A it always defaulted to the only known path L1. This explains the small arc that appears at point A in path F2. The arc represents an adjustment of the heading when the robot defaulted to path L1. The direction the robot takes at point B depends on the established goal. If the goal is to follow path L1, it continues along that path. If the goal is to follow path L2, it will disambiguate the predictions to retrieve only images from path L2. That behaviour explains the changes in direction that appear in the red line (F1) at point B. The arcs were drawn when the robot started at path L1, but with the goal of reaching the end of path L2.

Fig. 23.6 Results: paths taught and followed. The robot successfully switches from one path to another at node points A and B


Fig. 23.7 Typical city view, where the traffic turn is temporarily occluded by passing cars

Fig. 23.8 Paths learnt (blue and black) and followed, with small scenario changes. The robot correctly plans the routes and is immune to small changes in the reconstructed urban scenario

23.6.2 Tests in a Simulated Urban Environment

In a second experiment, the scenario was filled with images mimicking a typical city environment. Urban environments change very often. Ideally, the robot should learn one path in an urban environment and still be able to follow it when there are small changes, up to an acceptable level. For example, Fig. 23.7 shows two pictures of a traffic turn, taken only a few seconds apart. Although the rest of the scene is unchanged, one picture captures only the back of a car in the background, while the other captures a side view of another car in the foreground. Due to the small dimensions of the robot, it was not tested in a real city environment, but in a reconstruction of one. Figure 23.8 shows the results. Figure 23.8a shows the first scenario, where the robot was taught. In that scenario the robot, during segment AB, is guided essentially by the image of the traffic turn without the car. In a second part of the same experiment, the picture of the traffic turn was replaced by the other picture, with the car in the foreground, and the robot was made to follow the same paths. Again, it had to start at path L1 and finish at path L2, and vice versa. As Fig. 23.8b shows, it completed the tasks successfully.

23.7 Conclusions

Navigation based on view sequences is still an open research question. In this paper, a novel method was proposed that provides vision-based navigation based on an SDM. During a learning stage, the robot learns new paths. Connection points are established when two paths come together or split apart. In that way, a topological representation of the space is built, which gives the robot the ability to switch from one sequence to another and plan new paths. One drawback of this approach is that the SDM model, when simulated in software as in this case, requires a lot of processing and is not fast enough to operate in real time if the number of images is very large. Another disadvantage is that, using just front views, the robot only merges paths that come together with the same heading. That problem can be solved by using metric information to determine when the robot is in a place where it has already been, even if with another heading. Another possibility is to use omnidirectional images. The results shown demonstrate the feasibility of the approach. The robot was tested in two different environments: one a reconstruction of a country environment, the other a reconstruction of a changing urban environment. It was able to complete the tasks, even under changing conditions.


Chapter 24

A Novel Approach for Combining Genetic and Simulated Annealing Algorithms

Younis R. Elhaddad and Omar Sallabi

Abstract The Traveling Salesman Problem (TSP) is the most well-known NP-hard problem and is used as a test bed to check the efficacy of combinatorial optimization methods. No polynomial-time algorithm is known that can solve it, since all known algorithms for NP-complete problems require time that grows faster than any polynomial in the problem size. One feature of the problems addressed by Artificial Intelligence (AI) is that they do not respond to algorithmic solutions. This creates a dependence on heuristic search as an AI problem-solving technique. There are numerous examples of such techniques, including Genetic Algorithms (GA), Evolution Strategies (ES), Simulated Annealing (SA), Ant Colony Optimization (ACO), Particle Swarm Optimizers (PSO) and others, which can be used to solve large-scale optimization problems. However, some of them are time consuming, while others may not find the optimal solution. For this reason, many researchers have combined two or more algorithms in order to improve solution quality and reduce execution time. In this work, new operations and techniques are used to improve the performance of GA [1], and this improved GA is then combined with SA to implement a hybrid algorithm (HGSAA) to solve the TSP. The hybrid algorithm was tested using known instances from TSPLIB (a library of sample TSP instances available on the Internet), and the results are compared against some recent related works. The comparison shows that the HGSAA is effective in terms of both results and time.

Y. R. Elhaddad, O. Sallabi: Faculty of Information Technology, Garyounis University, P.O. 1308, Benghazi, Libya

S. I. Ao and L. Gelman (eds.), Electrical Engineering and Applied Computing, Lecture Notes in Electrical Engineering, 90, DOI: 10.1007/978-94-007-1192-1_24, Springer Science+Business Media B.V. 2011


24.1 Introduction

Many problems of practical and theoretical importance within the fields of artificial intelligence and operations research are of a combinatorial nature. In these problems, there is a finite solution set $X$ and a real-valued function $f: X \to \mathbb{R}$, and the goal is to search for a solution $x^* \in X$ with $f(x^*) \le f(x)\ \forall x \in X$. The goal of an optimization problem can be formulated as follows: rearrange control or decision variables according to some constraints in order to minimize or maximize the value of an objective function [2]. The most widely known example of a combinatorial optimization problem is the Traveling Salesman Problem (TSP) [2–4]. Problem-solving is an area of Artificial Intelligence (AI) that is concerned with finding or constructing the solution to a difficult problem, such as a combinatorial optimization problem, using AI algorithms such as Genetic Algorithms (GA), Simulated Annealing (SA), Ant Colony Optimization (ACO), Particle Swarm Optimizers (PSO), Iterated Local Search (ILS), Tabu Search (TS), and others. These can be used to solve large-scale optimization problems. However, some of them are time-consuming and others may not find the optimal solution within the available time. Thus many researchers have combined two or more algorithms in order to improve solution quality and reduce execution time. In this work, new techniques and operations are applied to GA in order to improve its performance. This improved GA is then combined with SA, using a new approach to the combination that produces a new Hybrid Genetic and Simulated Annealing Algorithm (HGSAA). The proposed algorithm was tested using symmetric TSP instances from the well-known TSPLIB [5], and the results show that the algorithm is able to find an optimal or near-optimal solution for instances of varying sizes.

24.2 The Travelling Salesman Problem

The Travelling Salesman Problem (TSP) is a classic case of a combinatorial optimization problem and is one of the most widely known NP-hard (non-deterministic polynomial-time hard) problems [3]. The travelling salesman problem is stated as follows: given a number of cities with associated city-to-city distances, what is the shortest round-trip tour that visits each city exactly once and then returns to the start city [6]? The TSP can also be stated as follows: given a complete graph $G$ with a set of vertices $V$, a set of edges $E$, and a cost $c_{ij}$ associated with each edge in $E$, where $c_{ij}$ is the cost incurred when traversing from vertex $i \in V$ to vertex $j \in V$, a solution to the TSP must return the minimum-distance Hamiltonian cycle of $G$. A Hamiltonian cycle is a cycle that visits each node in a graph exactly once and returns to the starting node. This is referred to as a tour in TSP terms. The real problem is to decide in which order to visit the nodes. While easy to explain, this problem is not always easy to solve. The TSP is classified as an NP-hard problem, and no polynomial-time algorithm that solves it is known. The TSP became


Fig. 24.1 Comparison of convergence rate for IGA and HGSAA

popular at the same time as the new subject of linear programming arose, along with the challenges of solving combinatorial problems. The TSP expresses all the characteristics of combinatorial optimization, so it is used to check the efficacy of any combinatorial optimization method and is often the first problem researchers use to test a new optimization technique [7]. Different types of TSP can be identified by the properties of the cost matrix. The repository TSPLIB [5] contains many different types of TSP and related problems. This work deals with symmetric TSP of type EUC_2D, where in the symmetric case (STSP) $c_{ij} = c_{ji}\ \forall i, j$; otherwise the problem is referred to as asymmetric (ATSP). The STSP data given in TSPLIB contain the problem name (usually a name followed by the number of cities in the problem; e.g., kroA100 and rd100 both contain 100 cities). The data also provide an $n \times 3$ array, where $n$ is the number of cities; the first column is the index of each city, and columns two and three are the positions of the city on the x-axis and the y-axis. Each city in a tour is marked by its position $(x_i, y_i)$ in the plane (see Fig. 24.1), and the cost matrix $c$ contains the Euclidean distances between the $i$th and $j$th cities:

$$c_{ij} = \sqrt{(x_i - x_j)^2 + (y_i - y_j)^2} \qquad (24.1)$$


The objective of TSP is to minimize the function $f$, where

$$f = \sum_{i=1}^{n-1} c_{i,i+1} + c_{1,n} \qquad (24.2)$$

The search space of a Euclidean TSP of $N$ cities contains $N!$ permutations. The objective is to find a permutation of the $N$ cities that has minimum cost. For a symmetric problem with $n$ cities there are $(n-1)!/2$ possible tours.
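As a concrete illustration of Eqs. 24.1 and 24.2, the following minimal Python sketch builds the Euclidean cost matrix from city coordinates and evaluates the length of a closed tour. The function names (`cost_matrix`, `tour_length`, `count_symmetric_tours`) and the four-city unit square are illustrative assumptions; they are not part of TSPLIB or of the authors' implementation.

```python
import math

def cost_matrix(coords):
    """Euclidean distances c[i][j] between all city pairs (Eq. 24.1)."""
    return [[math.hypot(xi - xj, yi - yj) for (xj, yj) in coords]
            for (xi, yi) in coords]

def tour_length(tour, c):
    """Length of the closed tour (Eq. 24.2); tour holds 0-based city indices."""
    n = len(tour)
    # consecutive legs plus the closing leg back to the first city
    return sum(c[tour[i]][tour[(i + 1) % n]] for i in range(n))

def count_symmetric_tours(n):
    """Number of distinct tours of a symmetric TSP with n >= 3 cities: (n - 1)! / 2."""
    return math.factorial(n - 1) // 2

# Hypothetical example: four cities on the corners of a unit square.
coords = [(0, 0), (1, 0), (1, 1), (0, 1)]
c = cost_matrix(coords)
print(tour_length([0, 1, 2, 3], c))   # perimeter of the square
print(count_symmetric_tours(4))       # distinct tours for four cities
```

Even for the 100-city instances used later, `count_symmetric_tours` returns a number far too large for exhaustive enumeration, which is why heuristic search is used.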

24.3 Genetic Algorithm

Evolutionary computation (EC) is based on the biological evolution processes of living organisms, following the theory of natural selection and survival of the fittest. EC operates iteratively on a population of individuals (candidate solutions to a problem). Operations such as reproduction, recombination, mutation and selection result in the "survival of the fittest", that is, the best solution occurring in the population of solutions. Genetic algorithms (GAs) are a specific type of Evolutionary Algorithm (EA). GAs are the focus here because they appear to be the best-suited evolutionary algorithms for combinatorial optimization problems. The power of GAs comes from their reliability and robustness as an optimization method and their applicability to a variety of complex problems. In general, GAs can be described as follows: a genetic algorithm starts by generating a random population of possible solutions. Each individual of the population is represented (coded) by a DNA-like string, called a chromosome, which contains a string of problem parameters. Individuals from the population are selected based on their fitness values. The selected parents are recombined to form a new generation. This process is repeated until some termination condition is met.
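The generic loop described above can be sketched as follows. This is a minimal illustration on a toy bit-string problem (maximising the number of ones), not the authors' IGA; the population size, one-point crossover, single bit-flip mutation, keep-the-fitter-half selection and fixed generation count are all assumptions chosen for brevity.

```python
import random

def genetic_algorithm(fitness, length=20, pop_size=30, generations=50, seed=0):
    rng = random.Random(seed)
    # random initial population of bit-string chromosomes
    pop = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    for _ in range(generations):                    # termination: fixed budget
        # selection: keep the fitter half as parents
        pop.sort(key=fitness, reverse=True)
        parents = pop[:pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, length)          # one-point crossover
            child = a[:cut] + b[cut:]
            i = rng.randrange(length)               # single bit-flip mutation
            child[i] = 1 - child[i]
            children.append(child)
        pop = parents + children                    # next generation
    return max(pop, key=fitness)

best = genetic_algorithm(fitness=sum)
print(sum(best), "ones out of", len(best))
```

Because the parents are carried over unchanged, the best fitness in the population never decreases from one generation to the next.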

24.4 Simulated Annealing

The purpose of physical annealing is to bring a solid to a low-energy state. This is achieved by melting the solid in a heat bath and gradually lowering the temperature in order to allow the particles of the solid to rearrange themselves in a crystalline lattice structure. This structure corresponds to a minimum energy state for the solid. The initial temperature of the annealing process is the point at which all particles of the solid are randomly arranged within the heat bath. At each temperature, the solid must reach what is known as thermal equilibrium before the cooling can continue [8]. If the temperature is reduced before thermal equilibrium is achieved, a defect will be frozen into the lattice structure and the resulting crystal will not correspond to a minimum energy state.


The Metropolis Monte Carlo simulation can be used to simulate the annealing process at a fixed temperature $T$. The Metropolis method randomly generates a sequence of states for the solid at the given temperature. A solid's state is characterized by the positions of its particles. A new state is generated by small movements of randomly chosen particles. The change in energy $\Delta E$ caused by the move is calculated, and acceptance or rejection of the new state as the next state in the sequence is determined according to the Metropolis acceptance condition: if $\Delta E < 0$ the move is accepted, and if $\Delta E > 0$ the move is accepted with probability $e^{-\Delta E / t}$; that is, the move is accepted if $e^{-\Delta E / t} > X$ and rejected otherwise, where $X$ is a random number with $0 < X < 1$. Simulated annealing algorithms have been applied to solve numerous combinatorial optimization problems. The name and idea of SA come from annealing in metallurgy, a technique involving heating and controlled cooling of a material to increase the size of its crystals and reduce their defects. The heat frees the atoms to move from their initial positions (initial energy). As the material is slowly cooled, the atoms continuously rearrange, moving toward a lower energy level. They gradually lose mobility due to the cooling, and as the temperature is reduced the atoms tend to crystallize into a solid. In the simulated annealing method, each solution $s$ in the search space is equivalent to a state of a physical system, and the function $f(s)$ to be minimized is equivalent to the internal energy of that state. The objective is to minimize the internal energy as much as possible. For successful annealing it is important to use a good annealing schedule, reducing the temperature gradually. The SA starts from a random solution $x_p$, selects a neighboring solution $x_n$ and computes the difference in the objective function values, $\Delta f = f(x_n) - f(x_p)$. If the objective function is improved ($\Delta f < 0$), the present solution $x_p$ is replaced by the new one $x_n$; otherwise the new solution, which worsens the value of the objective function, is accepted with probability $p_r = 1/(1 + e^{\Delta E / t})$ (here $\Delta E = \Delta f$), where $p_r$ decreases as the algorithm progresses and $t$ is the temperature or control parameter. This acceptance is decided by generating a random number $r_n$ with $0 \le r_n \le 1$ and comparing it against the threshold: if $p_r > r_n$, the current solution is replaced by the new one. The procedure is repeated until a termination condition is satisfied.
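The two acceptance rules discussed in this section can be sketched in Python as below. The function names are hypothetical, and the sign of the exponent in `acceptance_probability` is chosen so that the probability of accepting a worse neighbour shrinks as the temperature falls, as the text requires; the value 80 in the usage lines is the initial temperature quoted later in Sect. 24.6.

```python
import math
import random

def accept_metropolis(delta, t, rng=random):
    """Metropolis rule: accept improvements, else accept with probability exp(-delta / t)."""
    return delta < 0 or math.exp(-delta / t) > rng.random()

def acceptance_probability(delta, t):
    """p_r = 1 / (1 + exp(delta / t)) for a neighbour that is worse by delta."""
    # the exponent is capped only to avoid floating-point overflow at very low t
    return 1.0 / (1.0 + math.exp(min(delta / t, 700.0)))

def sa_step(current, neighbour, f, t, rng=random):
    """Return the solution kept after comparing the current one with one neighbour."""
    delta = f(neighbour) - f(current)
    if delta < 0:                                   # improvement: always accept
        return neighbour
    if acceptance_probability(delta, t) > rng.random():
        return neighbour                            # worse move accepted by chance
    return current

print(acceptance_probability(10, 80))   # mildly worse move at a high temperature
print(acceptance_probability(10, 1))    # same move at a low temperature
```

Under the second rule a worse neighbour is never accepted with probability above one half, so it is more conservative than the Metropolis rule at high temperatures.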

24.5 Improved Genetic Algorithm Technique

Crossover is the most important operation of GA, because in this operation characteristics are exchanged between the individuals of the population. Accordingly, the improved GA (IGA) is concerned with this operation more than with population size: the initial population consists of only two individuals, to which the Population Reformulates Operation (PRO) is applied. Multi-crossover is applied to these individuals to produce 100 children with different characteristics inherited from their parents, and ten copies of these children are made. Multi-mutation is then applied, where each copy is mutated with a different mutation method. The fitness function is evaluated for each individual, the best two individuals are selected, and finally the Partial Local Optimal (PLO) mutation operation is applied to the next generation. In the technique used for IGA, the tour is divided into three parts by two randomly selected cut points ($p_1$ and $p_2$). The head contains positions $(1, 2, \ldots, p_1 - 1)$, the middle contains $(p_1, p_1 + 1, \ldots, p_2)$, and the tail contains $(p_2 + 1, p_2 + 2, \ldots, n)$. Using multi-crossover, the head of the first parent is exchanged with the tail of the second parent. The middle remains unchanged until the partial local optimal mutation operation is applied, which improves the middle part of the tour by finding its local minimum. The role of the population reformulates operation is to change the structure of the tour by exchanging the head and the tail with the middle. In this way the procedure ensures that new cities will be in the middle part of each cycle, ready for improvement.

24.5.1 Multi-Crossover Operation

Crossover is the most important operation of GA because it exchanges characteristics between individuals; accordingly, many types of crossover operations are used to produce offspring with different attributes in order to build up an overall view of the search space. Multi-crossover works as follows. Two random cut points ($p_1$ and $p_2$) divide each tour into a head containing positions $(1, 2, \ldots, p_1 - 1)$, a middle containing $(p_1, p_1 + 1, \ldots, p_2)$, and a tail containing $(p_2 + 1, p_2 + 2, \ldots, n)$. The head and tail of each parent are flipped, and then the head of the first parent is swapped with the tail of the other parent, and vice versa. For example, if the two randomly selected crossover points are $p_1 = 4$ and $p_2 = 7$, and the two parent tours are:

Parent1 → 9 1 5 | 7 4 8 6 | 2 10 3   (head1 | mid1 | tail1)
Parent2 → 2 8 5 | 6 3 1 4 | 7 10 9   (head2 | mid2 | tail2)

For a valid tour, the elements of head2 and tail2 are removed from Parent1 to give the new mid1:

mid1 → 1 4 6 3

In the same way, the elements of head1 and tail1 are removed from Parent2 to give the new mid2:

mid2 → 8 6 4 7

Step 1 If the parts (head2, mid1, tail2) are reconnected using all possible permutations, six different children can be obtained (3!), e.g.

child1 → 2 8 5 1 4 6 3 7 10 9

In the same way, for (head1, mid2, tail1) six other children are produced, e.g.

child2 → 9 1 5 8 6 4 7 2 10 3

Step 2 If the two heads are flipped and then recombined as in step 1, 12 new different children are produced, e.g.

child3 → 5 8 2 1 4 6 3 7 10 9
child4 → 5 1 9 8 6 4 7 2 10 3

Step 3 If the two tails are flipped and then recombined as in step 1, 12 new different children are produced, e.g.

child5 → 2 8 5 1 4 6 3 9 10 7
child6 → 9 1 5 8 6 4 7 3 10 2

Step 4 If the two middles are flipped and then recombined as in step 1, 12 new different children are produced, e.g.

child7 → 2 8 5 3 6 4 1 7 10 9
child8 → 9 1 5 7 4 6 8 2 10 3

Step 5 If both the heads and the tails are flipped and then recombined as in step 1, 12 new different children are produced, e.g.

child9 → 5 8 2 1 4 6 3 9 10 7
child10 → 5 1 9 8 6 4 7 3 10 2

In each step 12 children are produced; therefore $5 \times 3! \times 2 = 60$ different children are produced from just two parents.
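The five steps above can be sketched in Python as follows. This is an illustrative reading of the worked example rather than the authors' code; the function names and the encoding of each step as a triple of flip flags are assumptions. Cut points are 1-based, as in the text.

```python
from itertools import permutations

def split(tour, p1, p2):
    """Head (1..p1-1), middle (p1..p2) and tail (p2+1..n), with 1-based cut points."""
    return tour[:p1 - 1], tour[p1 - 1:p2], tour[p2:]

def multi_crossover(parent1, parent2, p1, p2):
    head1, _, tail1 = split(parent1, p1, p2)
    head2, _, tail2 = split(parent2, p1, p2)
    # new middles: each parent with the other parent's head and tail cities removed
    used2 = set(head2) | set(tail2)
    mid1 = [city for city in parent1 if city not in used2]
    used1 = set(head1) | set(tail1)
    mid2 = [city for city in parent2 if city not in used1]
    # steps 1-5 as (flip head, flip middle, flip tail)
    steps = [(False, False, False), (True, False, False), (False, False, True),
             (False, True, False), (True, False, True)]
    children = []
    for head, mid, tail in ((head2, mid1, tail2), (head1, mid2, tail1)):
        for flip_h, flip_m, flip_t in steps:
            parts = [head[::-1] if flip_h else head,
                     mid[::-1] if flip_m else mid,
                     tail[::-1] if flip_t else tail]
            for order in permutations(parts):       # 3! ways to reconnect the parts
                children.append([city for part in order for city in part])
    return children

parent1 = [9, 1, 5, 7, 4, 8, 6, 2, 10, 3]
parent2 = [2, 8, 5, 6, 3, 1, 4, 7, 10, 9]
children = multi_crossover(parent1, parent2, 4, 7)
print(len(children))      # 2 parent combinations x 5 steps x 3! orders
print(children[0])        # child1 of the worked example
```

Every child is a valid tour by construction, because each is assembled from three parts that together contain every city exactly once.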

24.5.2 Selection Operation

Using rank selection, the best two individuals are selected for the next operations in order to reduce the execution time.

24.5.3 Mutation

The inversion mutation operation is used here: a random subtour is selected from the second individual and then reversed.
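A minimal Python sketch of inversion mutation is given below; the function name and the way the two cut positions are drawn are assumptions made for illustration.

```python
import random

def inversion_mutation(tour, rng=random):
    """Return a copy of the tour with one randomly chosen subtour reversed."""
    # two distinct positions i < j delimit the subtour tour[i..j]
    i, j = sorted(rng.sample(range(len(tour)), 2))
    return tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]

rng = random.Random(1)
original = list(range(10))
mutated = inversion_mutation(original, rng)
print(mutated)
```

The input list is left unchanged, so the caller can compare the fitness of the mutated tour with that of the original before deciding which to keep.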


24.5.4 The Rearrangement Operation

This operation is applied to both individuals. $c_{i,j}$ is the cost between the two adjacent cities $city_i$ and $city_j$, where $i = 1, 2, 3, \ldots, n - 1$ and $j = i + 1$. The aim of this operation is to find the greatest (max) value of $c_{i,j}$ among all the adjacent cities on the tour, and then swap $city_i$ with three other cities, one at a time. These cities are located at three different positions on the tour (beginning, middle, and end). The best of these positions is accepted, or the original position is kept if none of them improves the tour. This operation works in a random manner: it may not achieve any improvement for several iterations, but it is just as likely to take a big jump and improve the result.
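One possible reading of the rearrangement operation is sketched below in Python. The text does not say how the three swap positions are chosen, so drawing one random position from each third of the tour is an assumption, as are the function names; the sketch needs a tour of at least three cities.

```python
import random

def tour_cost(tour, c):
    """Cost of the closed tour under cost matrix c."""
    n = len(tour)
    return sum(c[tour[k]][tour[(k + 1) % n]] for k in range(n))

def rearrangement(tour, c, rng=random):
    n = len(tour)
    # position i of the most expensive edge (city_i, city_{i+1})
    i = max(range(n - 1), key=lambda k: c[tour[k]][tour[k + 1]])
    third = n // 3
    # assumption: one random swap partner from the beginning, middle and end
    partners = [rng.randrange(0, third),
                rng.randrange(third, 2 * third),
                rng.randrange(2 * third, n)]
    best = tour
    for j in partners:
        candidate = tour[:]
        candidate[i], candidate[j] = candidate[j], candidate[i]
        if tour_cost(candidate, c) < tour_cost(best, c):
            best = candidate                 # keep the cheapest swap seen so far
    return best                              # the original tour if no swap helps

# Hypothetical example: six cities on a line, with cost |a - b| between a and b.
c = [[abs(a - b) for b in range(6)] for a in range(6)]
tour = [0, 3, 1, 4, 2, 5]
result = rearrangement(tour, c, random.Random(0))
print(result, tour_cost(result, c))
```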

24.5.5 Partial Local Optimal Mutation Operation

In this operation, a subtour of an individual is selected randomly, with its size in the range $3 \le$ size of subtour $< n/4$. The ordering of this subtour that gives its local minimum is then found and exchanged with the original subtour. This operation is undertaken on one of the selected individuals after the mutation operation is performed.
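A small Python sketch of this idea follows. How the local minimum of the subtour is found is not specified in the text, so the exhaustive search over all orderings used here is an assumption that is only practical for short subtours; the function names are also hypothetical.

```python
import random
from itertools import permutations

def partial_local_optimal(tour, c, start, size):
    """Replace tour[start:start+size] by its cheapest reordering (exhaustive search)."""
    n = len(tour)
    # cities just before and just after the subtour (wrapping around the closed tour)
    before, after = tour[start - 1], tour[(start + size) % n]

    def path_cost(segment):
        inner = sum(c[a][b] for a, b in zip(segment, segment[1:]))
        return c[before][segment[0]] + inner + c[segment[-1]][after]

    best = min(permutations(tour[start:start + size]), key=path_cost)
    return tour[:start] + list(best) + tour[start + size:]

def plo_mutation(tour, c, rng=random):
    """Pick a random subtour with 3 <= size < n/4 and optimise it (needs n > 12)."""
    n = len(tour)
    size = rng.randrange(3, -(-n // 4))       # upper bound is ceil(n / 4), exclusive
    start = rng.randrange(0, n - size + 1)
    return partial_local_optimal(tour, c, start, size)

# Hypothetical example: 16 cities on a line, with cost |a - b| between a and b.
c = [[abs(a - b) for b in range(16)] for a in range(16)]
tour = [0, 5, 2, 9, 4, 7, 1, 12, 8, 3, 10, 15, 6, 13, 11, 14]
print(plo_mutation(tour, c, random.Random(3)))
```

Because the cities on either side of the subtour stay fixed, the cost of the full tour can never increase under this operation.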

24.6 The Proposed Hybrid Algorithm (HGSAA)

The proposed HGSAA is designed by combining the IGA and SA in order to reap the benefits of SA and reduce the time that IGA spends stuck at local minima. The initial temperature of SA is set to a small value, 80, because SA performs only ten cycles; this temperature ensures that SA can reach the state of equilibrium within these cycles. The hybrid algorithm starts with a random population, which is used as the input of the GA, and multi-crossover is then applied to produce 60 different children. The fitness of the parents and their offspring is calculated, and depending on the results a new population of the same size as the original population is selected. A partial local optimal mutation operation is then applied to one individual (according to the mutation probability) in order to improve its fitness value. The rearrangement operation is also applied to the population. This process continues until there is no improvement in the results for ten consecutive iterations. The memorized population from the GA that provides the best result is then transferred to the SA. The SA is used to improve the results by using the nearest solution technique. If the results are no longer improved within ten consecutive iterations, the best memorized population from the SA is moved back to the GA to repeat the above process. Figure 24.1 shows the convergence rate of HGSAA and IGA for the dsj1000 problem from TSPLIB [5]. Unlike the curve of HGSAA, the curve of the IGA gets stuck, and the result remains steady at many positions along the curve. In other words, in HGSAA the SA prevents the algorithm from being stuck for a long time and improves the results faster than the GA does alone.

Table 24.1 Results of HGSAA
Problem   Optimal  Best result  Iteration   Time (s)  Average   St. dev.  Error (%)
eil101    629      629          400         17 (15)   632.9     2.8       0
ch130     6110     6126         500         26        6146.7    14.8      0.6
ch150     6528     6528         750 (292)   46 (18)   6540.4    13.9      0
kroA100   21282    21282        400 (171)   18 (7)    21319.8   32.5      0
kroA150   26524    26524        800 (407)   53 (27)   26588.7   62.3      0
kroA200   29368    29382        1100        85        29434.9   45.7      0.23
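The alternation between the GA and SA phases described in Sect. 24.6 can be sketched schematically in Python. This is only a control-flow skeleton under stated assumptions: `ga_step` and `sa_step` are hypothetical caller-supplied functions that each return a new candidate solution, the fixed number of `rounds` stands in for the real termination condition, and the toy integer problem at the end is purely illustrative.

```python
def run_phase(step, solution, cost, patience=10):
    """Apply `step` repeatedly until `patience` consecutive iterations bring no improvement."""
    best, best_cost = solution, cost(solution)
    current, stale = solution, 0
    while stale < patience:
        current = step(current)
        current_cost = cost(current)
        if current_cost < best_cost:
            best, best_cost, stale = current, current_cost, 0   # memorize the best result
        else:
            stale += 1
    return best

def hybrid(initial, ga_step, sa_step, cost, rounds=5):
    """Alternate GA and SA phases, handing the best memorized solution from one to the other."""
    best = initial
    for _ in range(rounds):
        best = run_phase(ga_step, best, cost)   # GA until ten iterations without improvement
        best = run_phase(sa_step, best, cost)   # then SA, starting from the GA's best result
    return best

# Toy usage: minimise |x - 7| with two trivial stand-in "operators".
result = hybrid(0, lambda x: x + 1, lambda x: x - 1, lambda x: abs(x - 7), rounds=2)
print(result)
```

In a real implementation `ga_step` would bundle multi-crossover, selection, PLO mutation and rearrangement, and `sa_step` would generate a neighbour and apply the acceptance rule of Sect. 24.4.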

24.7 Experimental Results of HGSAA

The following sections discuss the results of the experiments and compare them with some recent related work that used hybrid genetic algorithms to solve the TSP.

24.7.1 Comparison with LSHGA

Instances with 100 or more cities from TSPLIB [5] that were used by Zhang and Tong [9] are used here. The same number of generations is used for each instance in order to compare the results of HGSAA with those of the local search heuristic genetic algorithm (LSHGA) [9]. The HGSAA was run for ten trials on each instance, and the summarized results are shown in Table 24.1, where column 2 shows the known optimal solutions; column 3 shows the best result obtained by the HGSAA; column 4 indicates the number of generations performed, with the number of generations needed to obtain the optimal result in parentheses; column 5 indicates the time in seconds used for each instance, with the time to obtain the optimal result in parentheses; column 6 shows the average of the ten results for each instance; column 7 shows the standard deviation of the ten results for each instance; and column 8 shows the error ratio between the best result and the optimal, which is calculated according to Eq. 24.3. The results of LSHGA are summarized in Table 24.2. The notations PS, CN, BS and Error denote the population size of the algorithm, the convergence iteration number, the best solution of the LSHGA, and the error, respectively. Errors are calculated according to Eq. 24.3.


Table 24.2 Results of LSHGA
Problem   PS    CN    BS     Error (%)
eil101    300   400   640    1.75
ch130     350   500   6164   0.88
ch150     400   750   6606   1.19
kroA100   300   400   21296  0.66
kroA150   450   800   26775  0.95
kroA200   500   1100  29843  1.62

Table 24.3 Results of HGSAA and HGA
Instance  Optimal  HGSAA           HGA [10]
Eil51     426      426 (428.98)    426 (428.87)
Eil76     538      538 (544.37)    538 (544.37)
Eil101    629      629 (640.2116)  629 (640.975)*
KroA100   21282    21282           21282
KroD100   21294    21294           21306
D198      15780    15781           15788
kroA200   29368    29368           29368

$$\text{Error} = \frac{\text{average} - \text{optimal}}{\text{optimal}} \times 100 \qquad (24.3)$$

From Tables 24.1 and 24.2 it is clear that the HGSAA performed better than the LSHGA. The HGSAA can find the optimal solution for four instances out of six, while LSHGA cannot find an optimal solution for any of the six instances. The error ratios in both tables indicate that the HGSAA performs much better than the LSHGA.
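Equation 24.3 is simple enough to check by hand; the short Python sketch below applies it to two entries taken from the tables. The function name `error_percent` is hypothetical, and its first argument is whichever tour length is being compared with the optimum (the average for the ch130 row of Table 24.1, the best solution BS for the eil101 row of Table 24.2).

```python
def error_percent(value, optimal):
    """Relative deviation from the optimum, in percent (Eq. 24.3)."""
    return (value - optimal) / optimal * 100

# ch130: average of the ten HGSAA runs against the known optimum (Table 24.1)
print(round(error_percent(6146.7, 6110), 2))
# eil101: best LSHGA solution against the known optimum (Table 24.2)
print(round(error_percent(640, 629), 2))
```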

24.7.2 Comparison with HGA

The HGSAA has been compared with the HGA proposed by Andal Jayalakshmi et al. [10]. The HGSAA was run on seven known instances from TSPLIB [5], with ten trials for each one, as in [10]. The HGSAA used both integer and real tours for eil51, eil76, and eil101. In Table 24.3, column 2 shows the known optimal solutions; column 3 shows the best result obtained by the HGSAA, with the real-valued result in parentheses; and column 4 indicates the best result of HGA from [10]. The comparison of the results summarized in Table 24.3 shows that HGSAA obtained better results than the HGA. For real tours, a new best result for instance eil101 is obtained by HGSAA; formerly, the best known result was the one reported in [10].

Table 24.4 Results of HGSAA and SAGA
Problem   HGSAA Avg.  HGSAA Std. dev.  SAGA Avg.  SAGA Std. dev.
dsj1000   1.72        0.19             2.27       0.39
d1291     1.91        0.596            3.12       1.12
fl1400    0.43        0.38             0.64       0.55
fl1577    0.92        0.41             0.64       0.55
pr2392    6.37        0.36             6.53       0.56

24.7.3 Comparison with SAGA

Stephen Chen and Gregory Pitt [11] proposed hybrid algorithms combining SA and GA and applied them to large-scale TSP instances. All of these instances were larger than 1,000 cities. Table 24.4 shows the average error from the optimal solution for each instance and the standard deviation, for both HGSAA and SAGA. The termination condition for the HGSAA is set to 7200 s for all problems except fl1577 and pr2392, for which it is set to 10800 s.

24.8 Conclusion and Future Work

Possible directions for future work can be summarized in the following points:
• To assess the proposed HGSAA, more empirical experiments may be needed for further evaluation of the algorithm. The points raised above may increase the effectiveness of the algorithm and should therefore be discussed and taken into consideration.
• The data structures of the HGSAA algorithm can be refined, so that the execution time may be further reduced.
• Genetic Algorithms can be hybridized with another heuristic technique for further improvement of the results.
• The presented algorithm can be used to solve different combinatorial problems such as DNA sequencing.

References

1. Elhaddad Y, Sallabi O (2010) A new hybrid genetic and simulated annealing algorithm to solve the traveling salesman problem. In: Lecture notes in engineering and computer science: proceedings of the world congress on engineering 2010, vol I, WCE 2010, June 30–July 2, London, UK, pp 11–14
2. Lawler EL (1976) Combinatorial optimization: networks and matroids. Holt, Rinehart, and Winston, New York
3. Larranaga P, Kuijpers CM, Murga RH, Inza I, Dizdarevic S (1999) Genetic algorithms for the travelling salesman problem: a review of representations and operators. CiteSeerX. citeseer.ist.psu.edu/318951.html. Accessed Nov 19, 2007
4. Fredman M et al (1995) Data structures for traveling salesmen. AT&T Labs Research. www.research.att.com/~dsj/papers/DTSP.ps. Accessed Feb 13, 2008
5. Heidelberg University. http://www.iwr.uni-heidelberg.de/groups/comopt/software/TSPLIB95/. Accessed Jan 22, 2007
6. Mitchell G, O'Donoghue D, Trenaman A (2000) A new operator for efficient evolutionary solutions to the travelling salesman problem. LANIA. www.lania.mx/~ccoello/mitchell00.ps.gz. Accessed Aug 22, 2007
7. Bhatia K (1994) Genetic algorithms and the traveling salesman problem. CiteSeer. http://citeseer.comp.nus.edu.sg/366188.html. Accessed Feb 26, 2008
8. Metropolis N et al (1953) Equation of state calculations by fast computing machines. Florida State University. www.csit.fsu.edu/~beerli/mcmc/metropolis-et-al-1953.pdf. Accessed Feb 17, 2008
9. Zhang J, Tong C (2008) Solving TSP with novel local search heuristic genetic algorithms. IEEE Xplore. http://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=4666929&isnumber=4666792. Accessed Jan 12, 2009
10. Jayalakshmi G, Sathiamoorthy S, Rajaram R (2001) A hybrid genetic algorithm: a new approach to solve traveling salesman problem. CiteSeer. http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.2.3692. Accessed Jan 14, 2008
11. Chen S, Pitt G (2005) Isolating the benefits of respect. York University. http://www.atkinson.yorku.ca/~sychen/research/papers/GECCO-05_full.pdf. Accessed Jan 5, 2009

Chapter 25

Buyer Coalition Formation with Bundle of Items by Ant Colony Optimization

Anon Sukstrienwong

Abstract In electronic marketplaces, there are several buyer coalition schemes that aim to obtain the best discount and the best total group utility when buying a large volume of products. However, few schemes focus on group buying with bundles of items. This paper presents an approach called GroupBuyACO for forming a buyer coalition with bundles of items via ant colony optimization (ACO). The focus of the proposed algorithm is to find the best formation of buyers with heterogeneous preferences in order to earn the best discount from vendors. The buyer coalition is formed with respect to the bundles of items, the item prices, and the buyer reservations. The simulation of the proposed algorithm is evaluated and compared with the GAGroupBuyer scheme by Sukstrienwong (Buyer formation with bundle of items in e-marketplaces by genetic algorithm. Lecture notes in engineering and computer science: proceedings of the international multiconference of engineers and computer scientists 2010, IMECS 2010, 17–19 March 2010, Hong Kong, pp 158–162). Experimental results indicate that the algorithm can improve the total discount of any coalition.

25.1 Introduction

At present, electronic commerce is becoming a necessary tool for many companies to sell their products because it is one of the fastest ways to advertise product information to a huge number of customers. Tons of products can

A. Sukstrienwong (&) Information Technology Department, School of Science and Technology, Bangkok University, Bangkok, Thailand e-mail: [email protected]

S. I. Ao and L. Gelman (eds.), Electrical Engineering and Applied Computing, Lecture Notes in Electrical Engineering, 90, DOI: 10.1007/978-94-007-1192-1_25, Springer Science+Business Media B.V. 2011


be sold rapidly within a few days, so companies can earn better profits from selling a large number of products. Ordinarily, many sellers offer attractive products at special prices. One strategy that sellers favor is selling their goods in bundles of items¹ at special prices. Moreover, several commercial websites such as http://www.buy.yahoo.com.tw/ and https://www.shops.godaddy.com/ usually offer a volume discount to customers if the number of items sold is large. On the buyer side, most buyers prefer to build corresponding purchasing strategies to reduce their purchase cost. For this reason, buyer coalition formation is rapidly becoming a popular buyer strategy on the Internet, because buyers can improve their bargaining power and negotiate more advantageously with sellers to purchase goods at a lower price.

In recent years, several buyer coalition schemes for electronic marketplaces have been developed. The main objective of these schemes is to gather all buyers' information in order to form a buyer coalition that purchases goods at low cost. This helps to reduce the cost of communication and makes it convenient for buyers to join a coalition. Ito et al. [10] presented an agent-mediated electronic market with a group buy scheme, in which buyers or sellers can sequentially enter the market to make their decisions. Tsvetovat et al. [18] investigated the use of incentives to create buying groups. Yamamoto and Sycara [20] presented the GroupBuyAuction scheme for forming buyer coalitions based on item categories. Hyodo et al. [8] then presented an optimal coalition formation among buyer agents based on genetic algorithms (GAs), with the purpose of distributing buyers among group-buying sites optimally to obtain good utilities. The Combinatorial Coalition Formation scheme of Li and Sycara [13] considers an e-marketplace where sellers set special offers based on volume, and buyers place a bid on a combination of items with reservation prices, where a reservation price is the maximum price that a buyer is willing to pay for an item of goods. In the work of Mahdi [14], GAs are applied to negotiating intelligent agents in electronic commerce using a simplified standard protocol. However, only a few schemes, such as the GroupPackageString scheme by Sukstrienwong [16] and the GroupBuyPackage scheme by Laor et al. [11], focus on buyer coalitions with bundles of items, and only the GroupPackageString scheme uses GAs to form the buyer coalition with bundles of items. This chapter extends the corresponding conference paper, Sukstrienwong [17], with further results. The proposed approach applies the ACO technique to form buyer coalitions with the aim of maximizing the total discount.

The paper is divided into five sections, including this introduction. The rest of the paper is organized as follows. Section 25.2 outlines group buying with bundles of items and the motivating problem. Section 25.3 presents the basic concept of ACO and the problem formalization of buyer formation with bundles of items. The experimental results of the simulation of the GroupBuyACO algorithm are given in Sect. 25.4. The conclusions and future work are in the last section.

¹ Bundle of items in the work of Gurler et al. [6] refers to the practice of selling two or more goods together in a package at a price which is below the sum of the independent prices.


Table 25.1 An example of price lists

                          Product types
Sellers  Package numbers  Toilet paper  Paper towel  Lotion     Detergent  Price ($)
s1       package11        pack of 1     –            –          –          8.9
         package12        –             pack of 1    –          –          14.0
         package13        –             pack of 3    –          –          32.5
         package14        pack of 1     pack of 6    –          –          50.9
s2       package21        –             –            pack of 1  –          10.5
         package22        –             –            –          pack of 1  19.0
         package23        –             –            –          pack of 4  67.0
         package24        –             –            pack of 1  pack of 8  92.0
s3       package31        –             pack of 1    –          –          14.0
         package32        –             –            –          pack of 1  19.0
         package33        –             pack of 3    –          pack of 1  49.5

25.2 Outline of Group Buying with Bundle of Items

In electronic marketplaces, sellers have more opportunity to sell their products in large numbers if their websites are well known among buyers. Moreover, the pricing strategy is one of the tools that sellers can use to increase the selling volume. Some sellers simultaneously make a single take-it-or-leave-it price offer to each unassigned buyer and to each buyer group, as described by Dana [2]. In this paper, I assume that the buyer group is formed with one goal: to maximize the aggregate buyers' utility, that is, the price discount received by being members of a coalition. Additionally, the definition of bundles of items differs slightly from that of Gurler et al. [6]; in this paper, it refers to several items sold together in a package of one or more goods at one price. The discount policy of sellers is based on the number of items bundled in the package. If the package is pure bundling, the average price of each item is lower than the price of a single-item package.

Suppose three sellers in the e-marketplace are selling some similar or identical products. The sellers prepare a large stock of goods and show the price list for each product. In this paper, I assume that the agents are autonomous and able to form coalitions when such a choice is beneficial. An example of the three sellers' information is shown in Table 25.1. The first seller, s1, sells two sizes of facial toner, 100 and 200 cc. To attract buyers, seller s1 has made special offers: package p13 is offered at the price of $32.0 and consists of three bottles of facial toner (200 cc). The average price of each bottle of facial toner (200 cc) is about 32.0/3 = 10.67 dollars/bottle, which is 14.0 − 10.67 = 3.33 dollars/bottle cheaper than a single bottle of facial toner (200 cc) in package p12.
At the same time, the third seller, s3, offers package p33, which comprises three bottles of facial toner (200 cc) and one bottle of body lotion (250 cc) at the price of $49.5. However, a single bottle of facial toner (200 cc) and a single bottle of body lotion (250 cc) are offered individually in package p12 at the price


Table 25.2 An example of buyers' orders with the reservation price

Buyer's order: number of items × (reservation price $)

        Facial toner              Body lotion
Buyers  100 cc      200 cc        1,500 cc   250 cc
b1      –           1 × (9.0)     –          1 × (10.5)
b2      –           –             –          2 × (10.95)
b3      –           –             3 × (6.0)  1 × (6.0)
b4      1 × (8.0)   4 × (11.0)    –          –

of $14.0 and in package p22 at the price of $19.0, respectively. Suppose some buyers want to purchase products listed in Table 25.1. Because buyers have heterogeneous preferences, some buyers do not want to purchase a whole bundle of items on their own; they only need a few items. Suppose a buyer called b1 wants to purchase one bottle of facial toner (200 cc) and one bottle of body lotion (250 cc), as shown in Table 25.2. Typically, buyers have seen the price lists provided by all sellers before placing their orders. The problem of buyer b1 is as follows. If buyer b1 purchases those products alone, the total cost is 14.0 + 19.0 = 33.0 dollars, which is the highest price at that time. Buyer b1 therefore participates in the group buying with the aim of obtaining better prices. Buyer b1 then places orders for the specific items with reservation prices of $9.0 for a bottle of facial toner (200 cc) and $10.5 for a bottle of body lotion (250 cc).
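The arithmetic in this example can be checked with a few lines of Python. This is only a sketch: the prices are the ones quoted in the running text (package p13 at $32.0), while the function and variable names are illustrative and do not come from the chapter.

```python
# Prices are those quoted in the running example of Sect. 25.2.
# The function and variable names are illustrative, not taken from the chapter.

def price_per_item(package_price, units):
    """Average price of one item when `units` items are sold as one package."""
    return package_price / units

single_toner = 14.0    # package p12: one bottle of facial toner (200 cc)
bundle_toner = 32.0    # package p13: three bottles, price as quoted in the text
single_lotion = 19.0   # package p22: one bottle of body lotion (250 cc)

avg_toner = price_per_item(bundle_toner, 3)
saving_per_bottle = single_toner - avg_toner
cost_alone_b1 = single_toner + single_lotion   # b1 buying without a coalition

print(round(avg_toner, 2))          # 10.67
print(round(saving_per_bottle, 2))  # 3.33
print(cost_alone_b1)                # 33.0
```

The gap between the stand-alone cost of 33.0 dollars and b1's reservation prices (9.0 + 10.5 = 19.5 dollars) is what motivates joining a coalition.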

25.3 Ant Colony Optimization for Buyer Coalition with Bundles of Items

25.3.1 The Basic Concept of ACO

Ant colony optimization (ACO) algorithms imitate the foraging behavior of real ants, as described by Goss et al. [5], in order to find good solutions to combinatorial optimization problems. The first ACO algorithm, known as ant system (AS), was introduced by Dorigo and Gambardella [3] and Dorigo and Di Caro [4]. ACO has been applied to classical NP-hard combinatorial optimization problems such as the traveling salesman problem (Lawler et al. [12]), the quadratic assignment problem (QAP) (Maniezzo et al. [15]), and the shop scheduling and mixed shop scheduling problems (Yamada and Reeves [19]). Applications of ACO appear in various fields. Ismail et al. [9] solve economic power dispatch problems using the ACO technique, and Alipour et al. [1] propose an ACO-based algorithm to enhance the quality of the final fuzzy classification system.


In nature, real ants are capable of finding the shortest path from a food source to their nest without using visual cues, as shown by Hölldobler and Wilson [7]. In ACO, a number of artificial ants build solutions to an optimization problem while updating pheromone information on their visited trails. Each artificial ant builds a feasible solution by repeatedly applying a stochastic greedy rule. While constructing its tour, an ant deposits a substance called pheromone on the ground and follows paths on which other ants have previously deposited pheromone. Once all ants have completed their tours, the ant that found the best solution deposits pheromone on its tour according to the pheromone trail update rule. The best solution found so far in the current iteration is used to update the pheromone information. The pheromone $\tau_{ij}$, associated with the edge joining i and j, is updated as follows:

\[ \tau_{ij} \leftarrow (1 - \rho)\,\tau_{ij} + \sum_{k=1}^{m} \Delta\tau_{ij}^{k}, \tag{25.1} \]

where $\rho \in (0, 1]$ is the evaporation rate; evaporation ensures that old pheromone does not have too strong an influence on future decisions. $\Delta\tau_{ij}^{k}$ is the amount of pheromone laid on edge (i, j) by ant k:

\[ \Delta\tau_{ij}^{k} = \begin{cases} Q / L_k & \text{if edge } (i, j) \text{ is used by ant } k \\ 0 & \text{otherwise,} \end{cases} \tag{25.2} \]

where Q is a constant and $L_k$ is the length of the tour performed by ant k. In constructing a solution, an ant starts from the starting city and moves to an unvisited city. At city i, ant k selects the next city j through a stochastic mechanism with probability $p_{ij}^{k}$ given by

\[ p_{ij}^{k} = \begin{cases} \dfrac{\tau_{ij}^{\alpha}\,\eta_{ij}^{\beta}}{\sum_{c_{il} \in N_i^k} \tau_{il}^{\alpha}\,\eta_{il}^{\beta}} & \text{if } j \in N_i^k \\ 0 & \text{otherwise,} \end{cases} \tag{25.3} \]

where $N_i^k$ is the feasible neighborhood of ant k at city i, that is, the set of cities that ant k has not yet visited. The parameters $\alpha$ and $\beta$ determine the relative influence of the pheromone trail and the heuristic information $\eta_{ij}$, which is given by

\[ \eta_{ij} = \frac{1}{d_{ij}}, \tag{25.4} \]

where $d_{ij}$ is the distance between cities i and j.
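The update and selection rules of Eqs. 25.1 to 25.4 can be sketched in a few lines of Python. This is a minimal illustration on a three-city toy instance, not the chapter's implementation: the function names, the distance matrix, and the choice to deposit pheromone in both directions of an edge (a symmetric problem) are all assumptions made here.

```python
def transition_probabilities(i, unvisited, tau, eta, alpha, beta):
    """Eq. 25.3: probability of moving from city i to each unvisited city j."""
    weights = {j: (tau[i][j] ** alpha) * (eta[i][j] ** beta) for j in unvisited}
    total = sum(weights.values())
    return {j: w / total for j, w in weights.items()}


def update_pheromone(tau, tours, lengths, rho, q=1.0):
    """Eqs. 25.1-25.2: evaporate, then each ant deposits q / L_k on its edges."""
    n = len(tau)
    for i in range(n):
        for j in range(n):
            tau[i][j] *= 1.0 - rho
    for tour, length in zip(tours, lengths):
        closed = tour + tour[:1]              # return to the starting city
        for a, b in zip(closed, closed[1:]):
            tau[a][b] += q / length
            tau[b][a] += q / length           # assumption: symmetric problem
    return tau


# Toy instance (hypothetical distances between three cities).
dist = [[0.0, 1.0, 2.0],
        [1.0, 0.0, 1.0],
        [2.0, 1.0, 0.0]]
n = len(dist)
# Eq. 25.4: heuristic information is the inverse distance.
eta = [[0.0 if i == j else 1.0 / dist[i][j] for j in range(n)] for i in range(n)]
tau = [[1.0] * n for _ in range(n)]

probs = transition_probabilities(0, [1, 2], tau, eta, alpha=1.0, beta=1.0)
tour = [0, 1, 2]
length = dist[0][1] + dist[1][2] + dist[2][0]
tau = update_pheromone(tau, [tour], [length], rho=0.5)

print(probs)      # city 1 is twice as likely as city 2
print(tau[0][1])  # 0.75
```

With equal pheromone everywhere, the nearer city is favored purely through the heuristic term; after the update, edges on the tour carry more pheromone than the rest of the matrix.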

25.3.2 Problem Formalization

There is a set of sellers on the Internet, S = {s1, s2, …, sm}, each offering to sell some or all of the goods in G = {g1, g2, …, gj}. Let B = {b1, b2, …, bn} denote the


collection of buyers. Each buyer wants to purchase several items posted by some sellers in S. Seller i makes special offers within a set of packages, denoted $PACKAGE_i = \{package_1^i, package_2^i, \ldots, package_k^i\}$. The average price of goods per item is a monotonically decreasing function of the package size. $PACKAGE_i$ is associated with a set of prices, denoted $PRICE_i = \{price_1^i, price_2^i, \ldots, price_k^i\}$, where $price_k^i$ is the price of $package_k^i$, which is a combination of several items defined as $package_k^i = \{g_1^{i,k}, g_2^{i,k}, \ldots, g_j^{i,k}\}$, $g_j^{i,k} \geq 0$. If goods $g_j$ is not bundled in $package_k^i$, then $g_j^{i,k} = 0$. Additionally, the product price of any seller $s_m$ is a function of the purchased quantity, denoted $p_m(q)$, where q is the quantity of the product. The product price function is monotonically decreasing, $dp_m(q)/dq < 0$. If a buyer $b_m$ needs to buy particular items offered by sellers in S, the buyer places an order denoted $Q_m = \{q_1^m, q_2^m, \ldots, q_j^m\}$, where $q_j^m$ is the quantity of item $g_j$ requested by buyer $b_m$. If $q_j^m = 0$, buyer $b_m$ has no request for goods $g_j$. Additionally, buyer $b_m$ must state a reservation price for each of the goods in $Q_m$, denoted $RS_m = \{rs_1^m, rs_2^m, \ldots, rs_j^m\}$, where $rs_h^m \geq 0$, $0 \leq h \leq j$. In this paper, I assume that each reservation price $rs_h^m$ is higher than or equal to the minimum price offered by sellers.

The objective of the problem is to find the best utility of the coalition; the following terms need to be defined. The coalition is a temporary alliance of buyers for the purpose of obtaining the best utility. The utility that buyer $b_m$ gains from buying $q_d^m$ items of $g_d$ at the price $price_d$ is $(rs_d^m - price_d)\,q_d^m$. The total utility of buyer $b_m$ is

\[ \sum_{d=1}^{j} (rs_d^m - price_d)\, q_d^m. \tag{25.5} \]

The total utility of the group is then defined as follows:

\[ U = \sum_{b_m \in B} \sum_{d=1}^{j} (rs_d^{b_m} - price_d)\, q_d^{b_m}, \tag{25.6} \]

where $j = |G|$.
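As an illustration of Eqs. 25.5 and 25.6, the following sketch computes the utility of individual buyers and of the whole group. The orders and reservation prices are those of buyers b1 and b2 in Table 25.2; the per-item prices paid by the coalition are hypothetical values chosen only for the example, and the dictionary layout and function names are assumptions.

```python
def buyer_utility(reservation, unit_price, quantity):
    """Eq. 25.5: sum over goods of (reservation price - price paid) * quantity."""
    return sum((reservation[g] - unit_price[g]) * q for g, q in quantity.items())


def coalition_utility(buyers, unit_price):
    """Eq. 25.6: total utility U of the group."""
    return sum(buyer_utility(b["rs"], unit_price, b["q"]) for b in buyers)


# Hypothetical per-item prices obtained by the coalition (not from the chapter).
unit_price = {"toner_200cc": 8.0, "lotion_250cc": 10.0}

buyers = [
    # b1 in Table 25.2
    {"q": {"toner_200cc": 1, "lotion_250cc": 1},
     "rs": {"toner_200cc": 9.0, "lotion_250cc": 10.5}},
    # b2 in Table 25.2
    {"q": {"lotion_250cc": 2},
     "rs": {"lotion_250cc": 10.95}},
]

u_b1 = buyer_utility(buyers[0]["rs"], unit_price, buyers[0]["q"])
total = coalition_utility(buyers, unit_price)

print(u_b1)             # 1.5
print(round(total, 2))  # 3.4
```

A negative term in the sum would mean that a buyer pays more than the reservation price for that item, which the formulation rules out by assumption.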

25.3.3 Forming Buyer Group with Bundles of Items by Ants

The proposed algorithm provides a means of buyer coalition formation by ACO. There are some restrictions in this paper. Buyers quote a buyer-specific price after they have seen the price list of all packages provided by sellers. The buyer coalition is formed concerning only the price attribute. And the price per item is a monotonically decreasing function when the size of the package


Fig. 25.1 Representing the work of one ant for creating the trail of ⟨3 package12, 2 package23, …⟩

increases. Additionally, the rule of the coalition is that each buyer is better off forming a group than buying individually. The buyer coalition cannot be formed if no utility is earned from forming the buyer group.

The first step in forming a buyer coalition with bundles of items is to represent the problem as a graph in which the optimum solution is a certain path through the graph. In Fig. 25.1, a solid line represents a package selected by ant k. If more than one unit of the selected package is picked, ant k moves further along the solid line. Ant k then deposits and updates the pheromone on the selected number of units of the specific package. In this particular problem, the ant randomly chooses the next package, which is represented by a dotted line. The probability $p_{ij}^{k}$ of selecting i units of the jth package is formally defined as

\[ p_{ij}^{k} = \begin{cases} \dfrac{\tau_{ij}^{\alpha}\,\eta_{ij}^{\beta}}{\sum_{i}\sum_{l \in D} \tau_{il}^{\alpha}\,\eta_{il}^{\beta}} & \text{if } j \in D, \text{ the set of packages offered by all sellers which have not been selected,} \\ 0 & \text{otherwise,} \end{cases} \tag{25.7} \]

where $\tau_{ij}$ is the intensity of the pheromone on the solid line. For instance, if at the starting point ant k has selected three units (i = 3) of package12, the ant deposits its pheromone only on package12, at the unit of 3. Ant k keeps moving along the path until all of the buyers' requests are matched. A possible result of the algorithm is shown in Fig. 25.1. The quantity of pheromone $\Delta\tau_{ij}^{k}$ is defined as follows:

\[ \Delta\tau_{ij}^{k} = \begin{cases} Q / U^{k} & \text{if } i \text{ units of package } j \text{ are used by ant } k \\ 0 & \text{otherwise,} \end{cases} \tag{25.8} \]

where Q is equal to one and $U^{k}$ is the total utility of the coalition derived from ant k. Keep in mind that $\eta_{ij}$ is given by


\[ \eta_{ij} = \begin{cases} \sum m_{ij} / u_{ij} & \text{if some items in the selected package are unmatched to the buyers' requests,} \\ 1 & \text{if all items in the selected package are matched to the buyers' requests,} \\ 0 & \text{otherwise,} \end{cases} \tag{25.9} \]

where $m_{ij}$ is the total number of items in the selected packages that are matched to the buyers' requests, and $u_{ij}$ is the total number of items in the selected package. At the beginning, the pheromone values of all package lines are initialized to a very small value c, 0 < c ≤ 1. After the problem graph has been initialized with a small amount of pheromone and each ant's starting point has been defined, a small number of ants run for a certain number of iterations. In every iteration, each ant determines a path through the graph from its starting point to the solid package line. The quality of a solution found by the ACO is measured by the total utility of the coalition in Eq. 25.6.
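One possible reading of the heuristic in Eq. 25.9 and the selection rule in Eq. 25.7 is sketched below. It is not the chapter's code. It assumes that $m_{ij}$ and $u_{ij}$ are plain item counts, that the "otherwise" case means no item is matched, and that candidates are (units, package) pairs; the demand of five bottles and the initial pheromone value c = 0.1 are hypothetical.

```python
def heuristic(matched, total):
    """Eq. 25.9, reading m_ij as matched items and u_ij as items selected."""
    if total == 0 or matched == 0:
        return 0.0                 # assumed meaning of the "otherwise" case
    if matched == total:
        return 1.0                 # every item in the selection is requested
    return matched / total         # some items are left unmatched


def selection_probabilities(candidates, tau, eta, alpha, beta):
    """Eq. 25.7 over the (units, package) pairs that can still be selected."""
    weights = {c: (tau[c] ** alpha) * (eta[c] ** beta) for c in candidates}
    total = sum(weights.values())
    if total == 0.0:
        return {c: 0.0 for c in candidates}
    return {c: w / total for c, w in weights.items()}


# Hypothetical situation: buyers still request 5 bottles of one product.
# package12 holds 1 bottle and package13 holds 3 bottles (Table 25.1).
demand = 5
items_per_package = {"package12": 1, "package13": 3}
candidates = [(1, "package12"), (1, "package13"), (2, "package13")]

eta = {}
for units, name in candidates:
    offered = units * items_per_package[name]
    eta[(units, name)] = heuristic(min(offered, demand), offered)

tau = {c: 0.1 for c in candidates}   # small initial pheromone value c
probs = selection_probabilities(candidates, tau, eta, alpha=2.0, beta=0.5)

for c in candidates:
    print(c, round(probs[c], 3))
```

In this toy case, taking two units of package13 would leave one of six bottles unmatched, so its heuristic value drops to 5/6 and it is selected slightly less often than the fully matched options.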

25.3.4 GroupBuyACO Algorithm

This section shows the implementation of the ACO algorithm for forming buyer groups with bundles of items, called the GroupBuyACO algorithm. The proposed algorithm can be described by the following algorithm:

[Algorithm listing: figure not recovered from the source; its steps reference Eqs. 25.7, 25.8 and 25.9.]

Table 25.3 Data settings for GroupBuyACO algorithm

Constant             Detail                               Value
NumOfBuyer           No. of buyers                        10
NumOfSeller          No. of sellers                       3
MaxNumPackageSeller  Max no. of packages for each seller  5
NumOfTypeInPackage   No. of product types in package      4

25.4 Experimental Results

This section presents the initial data settings of the simulation for forming buyer coalitions with the proposed GroupBuyACO algorithm. The algorithm was run several times with different numbers of artificial ants, values of α and β, and evaporation rates (ρ) to find which values steer the algorithm towards the best solution.

25.4.1 Initial Data Settings

The experimental results of the proposed algorithm are derived from a simulation implemented in more than 4,000 lines of C++. It is run on an IBM PC with a Pentium(R) D 2.80 GHz CPU and 2 GB of RAM. Table 25.3 summarizes the initial data settings for the GroupBuyACO algorithm in the simulation. For this example, the buyers' orders and reservation prices are selected randomly to demonstrate that the proposed algorithm can work on real-world data. Three different sellers offer various packages, all of which are pure bundling packages. Table 25.4 shows the products and price list offered by each seller. Seller s1 offers six packages: the first four are one-item packages and the remaining two are two-item packages. The average number of items per package of s1 is (4 × 1 + 2 × 2)/6 = 1.33. Seller s2 combines two items in each package, so its average number of items per package is two. Seller s3 offers four packages of three items, so its average number of items per package is three. Ten buyers participate in the group buying, as shown in Table 25.5.


Table 25.4 The price list example for the simulation

                          Product types
Sellers  Package numbers  A          B          C          D          Price ($)
s1       package11        pack of 1  –          –          –          1,000
         package12        –          pack of 1  –          –          1,000
         package13        –          –          pack of 1  –          1,000
         package14        –          –          –          pack of 1  1,000
         package15        pack of 1  pack of 1  –          –          1,950
         package16        –          pack of 1  pack of 1  –          1,900
s2       package21        –          –          pack of 1  pack of 1  1,925
         package22        pack of 1  –          pack of 1  –          1,950
         package23        pack of 1  –          –          pack of 1  1,920
         package24        –          pack of 1  –          pack of 1  1,970
s3       package31        pack of 1  pack of 1  pack of 1  –          2,700
         package32        –          pack of 1  pack of 1  pack of 1  2,690
         package33        pack of 1  –          pack of 1  pack of 1  2,750
         package34        pack of 1  pack of 1  –          pack of 1  2,700

Table 25.5 Buyer orders

Buyer's order: number of items × (reservation price $)

Buyers  A            B            C             D
b1      –            –            1 × (970.0)   –
b2      1 × (960.0)  1 × (975.0)  –             –
b3      –            –            1 × (1000.0)  –
b4      2 × (969.0)  –            –             –
b5      –            1 × (955.0)  1 × (960.00)  –
b6      –            –            –             1 × (980.00)
b7      –            –            2 × (980.0)   –
b8      –            4 × (970.0)  –             –
b9      –            –            –             1 × (989.0)
b10     1 × (965.0)  –            –             –

25.4.2 The GroupBuyACO Algorithm Performance

The first two parameters to be studied are α and β. As shown in Eq. 25.7, these parameters are related to the probability $p_{ij}^{k}$ of selecting i units of the jth package, because α is the exponent of $\tau_{ij}$ and β is the exponent of $\eta_{ij}$. Thus, variations in the values of α and β might play an important role in the GroupBuyACO algorithm. Let both α and β range from 0.5 to 3, and let the number of iterations be 200. The results of the corresponding variations in the values of α and β are shown in Table 25.6. The best result is shown in bold. It can be seen


Table 25.6 The average group utility for the corresponding values of α and β, iteration number = 2,000

α      β = 0.5   β = 1    β = 2    β = 3
0      759.06    457.22   791.33   673.21
0.5    755.25    594.23   814.71   734.72
1      623.57    757.48   698.21   542.01
2      927.24    907.09   456.98   671.45
3      554.65    657.84   569.24   459.27

Fig. 25.2 Number of iterations where initial settings α = 2, β = 0.5, and ρ = 0.1

Table 25.7 The comparison of GroupBuyACO algorithm with the genetic algorithm

GroupBuyACO algorithm ($)   GroupPackageString ($)
927.11                      909.74

that the average utility of the group earned by the GroupBuyACO algorithm is highest when α = 2 and β = 0.5. The evaporation rate ρ of the pheromone is one of the most important variables for the GroupBuyACO algorithm. From Fig. 25.2, it can be seen that when the value of ρ is approximately 0.1, the total utility earned from the group buying is the highest. The proposed algorithm is compared with the GAGroupBuyer scheme by Sukstrienwong [16]. To evaluate the performance of GroupBuyACO, the parameters were set to the following default values: α = 2, β = 0.5 and ρ = 0.1. As Table 25.7 shows, the GroupBuyACO algorithm outperforms GroupPackageString.

25.5 Conclusions and Future Work In this paper, a new method for buyer coalition formation with bundle of items by ant colony optimization technique is proposed. The aim of the proposed algorithm is to form a buyer coalition in order to maximize the group’s total utility. The ants


construct the trail by depositing pheromone after moving along a path and by updating the pheromone values associated with good or promising solutions on the edges of the path. The experimental results show that the proposed algorithm is effective in finding the best buyer coalitions with bundles of items. The solution quality of the GroupBuyACO algorithm is assessed by comparison with the genetic algorithm technique called the GroupPackageString scheme. The experimental results show that the GroupBuyACO algorithm is able to yield better results than the GAGroupBuyer scheme. However, the proposed algorithm has some restrictive constraints on forming a buyer coalition: (1) all buyers quote specific prices for their requested products after they have seen the price list of all packages provided by sellers; (2) the buyer coalition is formed concerning only the price attribute; and (3) the price per item is a monotonically decreasing function of the package size. Relaxing these restrictions can be investigated in future research.

References

1. Alipour H, Khosrowshahi Asl E, Esmaeili M, Nourhosseini M (2008) ACO-FCR: applying ACO-based algorithms to induct FCR. Lecture notes in engineering and computer science: proceedings of the world congress on engineering 2008, 2–4 July, London, UK, pp 12–17
2. Dana J (2004) Buyer groups as strategic commitments. Mimeo. Northwestern University, USA
3. Dorigo M, Gambardella LM (1997) Ant colony system: a cooperative learning approach to the traveling salesman problem. IEEE Trans Evolut Comput 1(1):53–66
4. Dorigo M, Di Caro G (1999) The ant colony optimization metaheuristic. In: Corne D et al (eds) New ideas in optimization. McGraw Hill, London, pp 11–32
5. Goss S, Beckers R, Deneubourg JL, Aron S, Pasteels JM (1990) How trail laying and trail following can solve foraging problems for ant colonies. In: Hughes RN (ed) Behavioural mechanisms of food selection. NATO-ASI Series, G 20. Springer, Berlin
6. Gurler U, Oztop S, Sen A (2009) Optimal bundle formation and pricing of two products with limited stock. Int J Prod Econ
7. Hölldobler B, Wilson EO (1990) The ants. Springer, Berlin, p 732
8. Hyodo M, Matsuo T, Ito T (2003) An optimal coalition formation among buyer agents based on a genetic algorithm. In: 16th international conference on industrial and engineering applications of artificial intelligence and expert systems (IEA/AIE’03), Loughborough, UK, pp 759–767
9. Ismail M, Nur Hazima FI, Mohd. Rozely K, Muhammad Khayat I, Titik Khawa AR, Mohd Rafi A (2008) Ant colony optimization (ACO) technique in economic power dispatch problems. Lecture notes in engineering and computer science: proceedings of the international multiconference of engineers and computer scientists, 19–21 March 2008, Hong Kong, pp 1387–1392
10. Ito T, Hiroyuki O, Toramatsu S (2002) A group buy protocol based on coalition formation for agent-mediated e-commerce. IJCIS 3(1):11–20
11. Laor B, Leung HF, Boonjing V, Dickson KW (2009) Forming buyer coalitions with bundles of items. In: Nguyen NT, Hakansson A, Hartung R, Howlett R, Jain LC (eds) KES-AMSTA 2009. LNAI 5559-0717. Springer, Heidelberg, pp 121–138
12. Lawler EL, Lenstra JK, Rinnooy-Kan AHG, Shmoys DB (eds) (1985) The traveling salesman problem. Wiley, New York


13. Li C, Sycara K (2007) Algorithm for combinatorial coalition formation and payoff diversion in an electronic marketplace. In: Proceedings of the first international joint conference on autonomous agents and multiagent systems, pp 120–127
14. Mahdi S (2007) Negotiating agents in e-commerce based on a combined strategy using genetic algorithms as well as fuzzy fairness function. In: Proceedings of the world congress on engineering, WCE 2007, vol I. 2–4 July 2007, London, UK
15. Maniezzo V, Colorni A, Dorigo M (1994) The ant system applied to the quadratic assignment problem. Université Libre de Bruxelles, Belgium, Tech. Rep. IRIDIA/94-28
16. Sukstrienwong A (2010) Buyer formation with bundle of items in e-marketplaces by genetic algorithm. Lecture notes in engineering and computer science: proceedings of the international multiconference of engineers and computer scientists 2010, IMECS 2010, 17–19 March 2010, Hong Kong, pp 158–162
17. Sukstrienwong A (2010) Ant colony optimization for buyer coalition with bundle of items. Lecture notes in engineering and computer science: proceedings of the world congress on engineering 2010, WCE 2010, 30 June–2 July, London, UK, pp 38–43
18. Tsvetovat M, Sycara KP, Chen Y, Ying J (2001) Customer coalitions in electronic markets. Lecture notes in computer science, vol 2003. Springer, Heidelberg, pp 121–138
19. Yamada T, Reeves CR (1998) Solving the Csum permutation flowshop scheduling problem by genetic local search. In: Proceedings of 1998 IEEE international conference on evolutionary computation, pp 230–234
20. Yamamoto J, Sycara K (2001) A stable and efficient buyer coalition formation scheme for e-marketplaces. In: Proceedings of the 5th international conference on autonomous agents, Montreal, Quebec, Canada, pp 576–583

Chapter 26

Coevolutionary Grammatical Evolution for Building Trading Algorithms

Kamal Adamu and Steve Phelps

Abstract Advancements in communications and computer technology have enabled traders to program their trading strategies into computer programs (trading algorithms) that submit electronic orders to an exchange automatically. The work in this chapter entails the use of a coevolutionary algorithm based on grammatical evolution to produce trading algorithms. The trading algorithms developed are benchmarked against a publicly available trading system called the turtle trading system (TTS). The results suggest that our framework is capable of producing trading algorithms that outperform the TTS. In addition, a comparison between trading algorithms developed under a utilitarian framework and those developed using the Sharpe ratio as objective function shows that they have statistically different performance.

26.1 Introduction Traders make trade decisions specifying entry, exit, and stop loss prices [1]. The entry is the price at which the trader wishes to enter the market, the exit is the price at which the trader expects to take profit, and the stop loss is the price at which the trader wants to exit a position when a trade is not in her favour [1]. A set of entry, exit, and stop loss rules is referred to as a trading system and

K. Adamu (&) S. Phelps Center for Computational Finance and Economic Agents, University of Essex, Colchester, UK e-mail: [email protected] S. Phelps e-mail: [email protected]

S. I. Ao and L. Gelman (eds.), Electrical Engineering and Applied Computing, Lecture Notes in Electrical Engineering, 90, DOI: 10.1007/978-94-007-1192-1_26, Springer Science+Business Media B.V. 2011


there exists an interdependency between these rules [1]. A trader who consistently fails to exit a losing trade after incurring a tolerable amount of loss will almost certainly be wiped out after a couple of losing trades. Moreover, a trader who takes profit too early or too late will either have very little to cover their costs and losses or lose part of the profit they have made [1]. Technicians decide on entry, exit, and stop loss prices based on technical trading rules [1]. Advancements in communication and computer technology have allowed traders to submit trades electronically using computer programs (trading algorithms) that sift through a vast amount of information looking for trade opportunities [2]. Trading algorithms have gained popularity due to their cost-effective nature [2]. According to Hendershott et al. [2], 75% of trades executed in the US in 2009 were by trading algorithms.

The aim of the work in this chapter is to test whether a methodology based on grammatical evolution (GE) [3] can be used to coevolve rules for entry, exit, and stop loss that outperform a publicly available trading system called the turtle trading system in high frequency [1, 4]. This chapter also tests whether trading algorithms developed under a utilitarian framework have similar performance to trading algorithms developed using the Sharpe ratio as objective function. Adamu and Phelps, and Saks and Maringer [5, 6] employ cooperative coevolution in developing technical trading rules. In this chapter, we coevolve rules that form trading algorithms using GE for high frequency trading. The trading algorithms evolved are benchmarked against the turtle trading system. In addition, the effect of various objective functions on the trading algorithms evolved is considered.

The rest of this chapter is organised as follows. Section 26.2 gives a survey of the turtle trading system and investor preference. We explain our framework in Sect. 26.3 and present the data used for the study in Sect. 26.4. Our results are presented in Sect. 26.5, and the chapter ends with a summary in Sect. 26.6.

26.2 Background

26.2.1 The Turtle Trading System

A trading system is a set of rules that signal when to enter and exit a position, where a position is a stake in a particular asset in a particular market [1]. The rules in trading systems specify when to enter the market when prices are expected to fall, when to enter the market when prices are expected to rise, and how to minimise loss and maximise profit (money management) [1]. The entry rules for the turtle trading system are specified as follows [1, 4]:


$H_t$, $t \in \{1, 2, 3, 4, \ldots, T\}$, is the current highest price, and $L_t$, $t \in \{1, 2, 3, 4, \ldots, T\}$, is the current lowest price. The TTS places the initial stop loss at entry using the following equation:

$$\mathrm{Stop}_t = \begin{cases} \mathrm{Stop}_{t-1} - 2\,\mathrm{ATR}_t & \text{if long} \\ \mathrm{Stop}_{t-1} + 2\,\mathrm{ATR}_t & \text{if short} \end{cases} \tag{26.1}$$

where $\mathrm{ATR}_t$ is the current average true range, calculated as follows:

$$\mathrm{ATR}_t = \frac{19\,N_{t-1} + \mathrm{TR}_t}{20} \tag{26.2}$$

$\mathrm{TR}_t$, $t \in \{1, 2, 3, 4, \ldots, T\}$, is the true range, calculated as follows:

$$\mathrm{TR}_t = \max\left(H_t - L_t,\; H_t - C_{t-1},\; C_{t-1} - L_t\right) \tag{26.3}$$

$C_t$, $t \in \{1, 2, 3, 4, \ldots, T\}$, is the price at the end of the time interval $t$.
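The stop and average-true-range updates of Eqs. 26.1 to 26.3 can be sketched in a few lines of Python. This is only an illustrative sketch: the function names and the sample bar are hypothetical, and $N_{t-1}$ in Eq. 26.2 is read here as the previous value of the average true range, which is an assumption about the notation.

```python
def true_range(high, low, prev_close):
    """True range of one bar (Eq. 26.3)."""
    return max(high - low, high - prev_close, prev_close - low)


def update_atr(prev_atr, tr):
    """Recursive smoothing of the true range (Eq. 26.2).

    Assumption: prev_atr plays the role of N_{t-1}.
    """
    return (19.0 * prev_atr + tr) / 20.0


def update_stop(prev_stop, atr, side):
    """Stop update of Eq. 26.1; side is 'long' or 'short'."""
    if side == "long":
        return prev_stop - 2.0 * atr
    if side == "short":
        return prev_stop + 2.0 * atr
    raise ValueError("side must be 'long' or 'short'")


# Hypothetical bar: high 10.5, low 10.0, previous close 10.2, previous ATR 0.4.
tr = true_range(10.5, 10.0, 10.2)
atr = update_atr(0.4, tr)
long_stop = update_stop(10.0, atr, "long")
```

Keeping the three steps as separate functions mirrors the three equations, so each can be checked against the text on its own.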

26.2.2 Investor Preference

26.2.2.1 Utility Theory

Traditional finance and economics postulate that the financial markets are populated by rational, risk-averse agents that prefer more wealth to less wealth [7]. One of the cornerstones of efficient markets is the presence of homo economicus, the rational, risk-averse agent with a preference for more wealth over less wealth [6, 7]. In utilitarian terms this translates to expected-utility-maximising investors with utility functions that satisfy $U'(W) > 0$ and $U''(W) < 0$, where $U(W)$ is the utility of wealth and $W$ is the current level of wealth [7]. The power utility function, negative exponential utility function, and quadratic utility function satisfy $U'(W) > 0$ and $U''(W) < 0$.


The power utility function is defined by the following equation [7]:

$$U(W) = \frac{W^{1-\gamma}}{1-\gamma}, \qquad \gamma > 0,\ \gamma \neq 1 \tag{26.4}$$

$\gamma$ controls the degree of risk aversion of the utility function. There is evidence to suggest that traders exhibit loss aversion as well as risk aversion [6]; hence, in this chapter a modified wealth is used to account for loss aversion [6]. The power utility function (PUF) is then defined as follows [6]:

$$U(W_i) = \frac{W_i^{1-\gamma}}{1-\gamma} - \frac{1}{1-\gamma}, \qquad \gamma > 1 \tag{26.5}$$

$$W_i = \begin{cases} W_0 (1 + v_i) & \text{if } v_i > 0 \\ W_0 (1 + v_i)^{\lambda} & \text{if } v_i < 0,\ \lambda > 1 \end{cases} \tag{26.6}$$

where $v_i$ is the simple return for trade interval $i$, $i \in \{1, 2, \ldots, N\}$, and $W_i$ is a modified level of wealth for the given trade interval $i$. For this study we consider the case of a unit investor and set the initial level of wealth $W_0 = 1$. $\gamma$ and $\lambda$ define the risk and loss preferences of the agents, respectively.

Quadratic utility function. The quadratic utility function (QUF) is given by the following equation [7]:

$$U(W_i) = W_i - \frac{b}{2} W_i^2, \qquad b > 0 \tag{26.7}$$

where $W$ is the wealth. To satisfy the condition $U'(W) > 0$, we set $W = 1/b$ for levels of wealth $W > 1/b$.

The negative exponential utility function is given by the following equation [7]:

$$U(W_i) = a - b\,e^{-cW_i}, \qquad c > 0 \tag{26.8}$$
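As a minimal sketch, the three utility functions and the loss-averse wealth transformation described above might be coded as follows. The function and parameter names are illustrative choices, not taken from the chapter, and the treatment of a zero return (which Eq. 26.6 leaves open) is an assumption.

```python
import math


def modified_wealth(v, lam, w0=1.0):
    """Loss-averse wealth: negative returns are raised to the power lam > 1.

    Assumption: a zero return leaves wealth at w0 (both branches agree there).
    """
    growth = 1.0 + v
    return w0 * growth if v > 0 else w0 * growth ** lam


def power_utility(w, gamma):
    """Power utility with the constant shift used in the PUF, for gamma > 1."""
    return (w ** (1.0 - gamma) - 1.0) / (1.0 - gamma)


def quadratic_utility(w, b):
    """Quadratic utility; wealth is capped at 1/b so utility never decreases."""
    w = min(w, 1.0 / b)
    return w - 0.5 * b * w * w


def neg_exp_utility(w, a, b, c):
    """Negative exponential utility a - b * exp(-c * w), with c > 0."""
    return a - b * math.exp(-c * w)


# Hypothetical trade interval: a 10% loss seen by an agent with lam = 2, gamma = 3.
w_loss = modified_wealth(-0.10, lam=2.0)
u_loss = power_utility(w_loss, gamma=3.0)
```

With these settings the 10% loss is felt as a modified wealth of about 0.81 rather than 0.90, which is how the exponent on losses expresses loss aversion.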

26.2.2.2 Sharpe Ratio

It follows that, provided a utility function satisfies $U'(W) > 0$ and $U''(W) < 0$, it suffices to look at the mean and variance of the outcome of investments, regardless of the distribution of the outcome of investments [8]. One measure that takes the mean of return and the standard deviation of return into account is the Sharpe ratio [8]. The Sharpe ratio is defined by the following formula [8]:

$$\frac{\mu_r - r_f}{\sigma_r} \tag{26.9}$$

$r_f$ is the risk-free rate of interest (this is negligible in high frequency), and $\mu_r$ and $\sigma_r$ are the mean and standard deviation of return, respectively. A high Sharpe ratio implies a high mean return per unit of risk and vice versa.
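A short Python sketch of Eq. 26.9 is given below. The use of the sample standard deviation and the default risk-free rate of zero are assumptions made for illustration; the chapter does not state which estimator is used.

```python
import statistics


def sharpe_ratio(returns, risk_free=0.0):
    """Mean excess return per unit of standard deviation (Eq. 26.9).

    Assumption: sample standard deviation (n - 1 in the denominator).
    """
    mu = statistics.mean(returns)
    sigma = statistics.stdev(returns)
    if sigma == 0.0:
        raise ValueError("Sharpe ratio is undefined for constant returns")
    return (mu - risk_free) / sigma


# Hypothetical per-interval returns of one trading algorithm.
example = sharpe_ratio([0.01, 0.03])
```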


26.3 Framework

Our framework develops trading algorithms of the form:

Our framework coevolves the entry, exit, and stop loss rules for long and short positions. Each set of rules is a species of its own. We denote the species of entry, exit, and stop loss rules for long positions as $E_k^L$, $C_k^L$, and $S_k^L$, $k \in \{1, 2, 3, \ldots, N\}$, respectively. $E_k^S$, $C_k^S$, and $S_k^S$, $k \in \{1, 2, 3, \ldots, N\}$, denote the entry, exit, and stop loss rules for short positions. The transition table for the algorithm given above is shown in Table 26.1.

GE is used to evolve species within each population. Sexual reproduction is interspecies and we employ an implicit speciation technique within each species [9]. Rules are spatially distributed on a notional toroid and reproduce sexually with rules within their deme [9]. This is akin to individuals sharing information with individuals within their social circle. The deme of a rule $k$ is the set of individuals within the immediate vicinity of rule $k$ on the imaginary toroid. Collaborators are chosen from other species based on an elitist principle [10]. For instance, when assessing a solution from the set $E_k^L$, the best solutions from $C_k^L$, $S_k^L$, $E_k^S$, $C_k^S$, and $S_k^S$ are chosen for collaboration and the fitness attained is assigned to rule $k$.

Table 26.1 Transition table for entry, exit, and stop loss rules

Current position   E_k^L   C_k^L   S_k^L   E_k^S   C_k^S   S_k^S   Action
Long               X       0       0       X       X       X       Do nothing
Long               X       0       1       X       X       X       Close long position
Long               X       1       0       X       X       X       Close long position
Long               X       1       1       X       X       X       Close long position
Short              X       X       X       X       0       0       Do nothing
Short              X       X       X       X       0       1       Close short position
Short              X       X       X       X       1       0       Close short position
Short              X       X       X       X       1       1       Close short position
Neutral            0       X       X       0       X       X       Do nothing
Neutral            0       X       X       1       X       X       Open short position
Neutral            1       X       X       0       X       X       Open long position
Neutral            1       X       X       1       X       X       Do nothing

X stands for ignore

The platform for collaboration and fitness evaluation is a trading algorithm. Each species exerts evolutionary pressure on the others: rules that contribute to the profitability of the trading algorithm attain high fitness and survive to pass their genetic material down to their offspring, whereas rules that do not contribute are awarded low fitness and are eventually replaced by solutions with higher fitness. Selection occurs at the population level such that, for each species, a tournament is performed and a rule is replaced by its offspring if its fitness is less than the fitness of its offspring. This can be expressed formally using the following equation:

$$x = \begin{cases} y & \text{if } f_y > f_x \\ x & \text{otherwise} \end{cases} \tag{26.10}$$

where $f_x$ is the fitness of rule $x$ and $f_y$ is the fitness of its offspring $y$.
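The transition logic of Table 26.1 and the replacement rule of Eq. 26.10 can be written compactly. The sketch below is illustrative only: the function names, the string labels for positions and actions, and the boolean encoding of rule outputs are assumptions, not the authors' implementation.

```python
def next_action(position, entry_long, close_long, stop_long,
                entry_short, close_short, stop_short):
    """Return the action of Table 26.1 for the current position and rule outputs."""
    if position == "long":
        # Only the long exit and long stop loss rules are consulted.
        return "close long" if (close_long or stop_long) else "do nothing"
    if position == "short":
        # Only the short exit and short stop loss rules are consulted.
        return "close short" if (close_short or stop_short) else "do nothing"
    # Neutral: open a position only when exactly one entry rule fires.
    if entry_long and not entry_short:
        return "open long"
    if entry_short and not entry_long:
        return "open short"
    return "do nothing"


def select(parent, offspring, fitness):
    """Eq. 26.10: the offspring replaces the parent only if it is strictly fitter."""
    return offspring if fitness(offspring) > fitness(parent) else parent


# Hypothetical check: both entry rules firing in a neutral state cancel out.
conflict = next_action("neutral", True, False, False, True, False, False)
```

Note that in a neutral state conflicting entry signals result in no trade, which matches the last row of the table.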

26.3.1 Objective Function

The following assumptions are implicit in the fitness evaluation:

1. Only one position can be traded at any instant.
2. Only one unit can be traded at any instant.
3. There is no market friction (zero transaction cost, zero slippage, zero market impact). Arguably, since only one unit is traded at any instant, the effect of market impact can be considered negligible.


26.3.1.1 Sharpe Ratio

The Sharpe ratio is computed using Eq. 26.9. The objective is then to maximise:

$$\max_{k} \frac{\mu_r^k}{\sigma_r^k} \tag{26.11}$$

where $\mu_r^k$ and $\sigma_r^k$ are the mean and standard deviation of the returns of trading algorithm $k$, $k \in \{1, 2, 3, \ldots, P\}$.

26.3.1.2 Expected Utility

The objective function when using utility functions is the expected utility, which is calculated as follows:

$$f = E(U(W)) = \frac{1}{N} \sum_{j=1}^{30} \sum_{i=1}^{N} U(W_i, \theta_j) \tag{26.12}$$

where $U(W_i, \theta_j)$ is the utility of wealth at interval $i$, $i \in \{1, 2, 3, \ldots, N\}$, given the vector of parameters $\theta_j$ for the utility function, and $N$ is the number of trading intervals. The utility for each interval is calculated for a range of parameter values (see Sect. 26.3.2 for parameter settings). The objective in the utilitarian framework can be formally expressed as follows:

$$\max_{\theta_j} E(U(W_i, \theta_j)), \qquad i \in \{1, 2, 3, \ldots, N\} \tag{26.13}$$
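The double sum of Eq. 26.12 can be sketched as follows. The helper name, the toy utility function, and the idea of passing the parameter vectors as a list are assumptions made for illustration; in the chapter the sum runs over 30 sampled parameter vectors.

```python
def expected_utility(wealth, thetas, utility):
    """Sum the utility over all parameter vectors and intervals, divided by N.

    wealth  : list of (modified) wealth levels W_i, one per trade interval
    thetas  : list of parameter settings theta_j for the utility function
    utility : callable utility(w, theta)
    """
    n = len(wealth)
    total = sum(utility(w, theta) for theta in thetas for w in wealth)
    return total / n


# Toy illustration with a hypothetical linear "utility" w * theta.
toy_fitness = expected_utility([1.0, 2.0], [1.0, 2.0], lambda w, theta: w * theta)
```

A candidate rule set would then be scored by calling this function with one of the utility functions of Sect. 26.2.2.1 and the wealth path produced by its trades.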

26.3.2 Parameter Settings

The population size $P$ of each species is set to 100 and the coevolutionary process is allowed to run for a maximum number of generations, MaxGen = 200. The coevolutionary process is terminated after MaxGen/2 generations if there is no improvement in the fitness of the elitist (best solution) of the best solutions from each species. The deme size for each species is set to 11. The grammar used in mapping the entry and exit rules of the trading algorithms is shown in Table 26.2. The grammar used in mapping the stop loss rules of the trading systems is shown in Table 26.3.

In our notation, O(t-n:t-1) represents a set of open prices, C(t-n:t-1) a set of closing prices, H(t-n:t-1) a set of highest prices, and L(t-n:t-1) a set of lowest prices between t-n and t-1. O(t-n) represents the open price at t-n, C(t-n) the closing price at t-n, H(t-n) the highest price at t-n, and L(t-n) the lowest price at t-n, where $n \in \{10, 11, 12, \ldots, 99\}$ and $t \in \{1, 2, \ldots\}$. sma(•) and ema(•) stand for simple and exponential moving average, respectively.


Table 26.2 Grammar for mapping $E_k^L$, $E_k^S$, $C_k^L$, and $C_k^S$

φ            Rule                                                                        n
<expr>    ::= <binop>(<expr>, <expr>) | <rule>                                           (2)
<rule>    ::= <var> <op> <var> | <var> <op> <fun> | <fun> <op> <fun>                     (3)
<binop>   ::= and | or | xor                                                             (3)
<var>     ::= H(t-<window>) | L(t-<window>) | O(t-<window>) | C(t-<window>)              (4)
<op>      ::= > | < | = | …
<window>  ::= <integer><integer>                                                         (1)
<integer> ::= 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9                                          (9)
<fun>     ::= sma(H(t-<window>:t-1)) | ema(H(t-<window>:t-1))                            (15)
            | max(H(t-<window>:t-1)) | min(H(t-<window>:t-1))
            | sma(L(t-<window>:t-1)) | ema(L(t-<window>:t-1))
            | max(L(t-<window>:t-1)) | min(L(t-<window>:t-1))
            | sma(O(t-<window>:t-1)) | ema(O(t-<window>:t-1))
            | max(O(t-<window>:t-1)) | min(O(t-<window>:t-1))
            | sma(C(t-<window>:t-1)) | ema(C(t-<window>:t-1))
            | max(C(t-<window>:t-1)) | min(C(t-<window>:t-1))

φ is the set of non-terminals, and n is the number of rules for mapping the non-terminal φ

Table 26.3 Grammar for mapping $S_k^L$ and $S_k^S$

φ            Rule                                                                        n
<expr>    ::= <preop>(<expr>, <expr>) | <rule>                                           (2)
<rule>    ::= <rule> <op> <rule> | <var> <op> <var>                                      (6)
            | <var> <op> <fun> | <fun> <op> <fun>
            | <fun> | <var>
<preop>   ::= min | max                                                                  (2)
<var>     ::= H(t-<window>) | L(t-<window>) | O(t-<window>) | C(t-<window>)              (4)
<fun>     ::= sma(H(t-<window>:t-1)) | ema(H(t-<window>:t-1))                            (15)
            | max(H(t-<window>:t-1)) | min(H(t-<window>:t-1))
            | sma(L(t-<window>:t-1)) | ema(L(t-<window>:t-1))
            | max(L(t-<window>:t-1)) | min(L(t-<window>:t-1))
            | sma(O(t-<window>:t-1)) | ema(O(t-<window>:t-1))
            | max(O(t-<window>:t-1)) | min(O(t-<window>:t-1))
            | sma(C(t-<window>:t-1)) | ema(C(t-<window>:t-1))
            | max(C(t-<window>:t-1)) | min(C(t-<window>:t-1))
<window>  ::= <integer><integer>                                                         (1)
<integer> ::= 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9                                          (9)

φ is the set of non-terminals, and n is the number of rules for mapping the non-terminal φ
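To illustrate how a grammar of this kind turns an integer genome into a rule, the following sketch applies the usual GE mapping: the leftmost non-terminal is expanded with the production indexed by the current codon modulo the number of productions. The chapter does not spell out its mapping details, so the codon handling (no codon consumed when only one production exists, a fixed number of wraps) is an assumption, and the grammar is a deliberately reduced, hypothetical subset of the entry and exit grammar.

```python
# Reduced, illustrative grammar: only three operators and two functions.
GRAMMAR = {
    "<expr>": [["<binop>", "(", "<expr>", ",", "<expr>", ")"], ["<rule>"]],
    "<rule>": [["<var>", "<op>", "<var>"],
               ["<var>", "<op>", "<fun>"],
               ["<fun>", "<op>", "<fun>"]],
    "<binop>": [["and"], ["or"], ["xor"]],
    "<var>": [["H(t-", "<window>", ")"], ["L(t-", "<window>", ")"],
              ["O(t-", "<window>", ")"], ["C(t-", "<window>", ")"]],
    "<op>": [[">"], ["<"], ["="]],
    "<window>": [["<integer>", "<integer>"]],
    "<integer>": [[str(d)] for d in range(1, 10)],
    "<fun>": [["sma(C(t-", "<window>", ":t-1))"],
              ["ema(C(t-", "<window>", ":t-1))"]],
}


def ge_map(codons, start="<expr>", max_wraps=2):
    """Map a list of integer codons to a rule string, or None if mapping fails."""
    symbols = [start]
    output = []
    used = 0
    limit = len(codons) * (max_wraps + 1)  # codon reads allowed, including wraps
    while symbols:
        symbol = symbols.pop(0)
        if symbol not in GRAMMAR:
            output.append(symbol)  # terminal: emit as is
            continue
        productions = GRAMMAR[symbol]
        if len(productions) == 1:
            choice = productions[0]  # no decision needed, no codon consumed
        else:
            if used >= limit:
                return None  # genome exhausted before the rule was complete
            codon = codons[used % len(codons)]
            used += 1
            choice = productions[codon % len(productions)]
        symbols = list(choice) + symbols
    return "".join(output)


# Hypothetical genome of nine codons.
rule = ge_map([1, 0, 2, 3, 4, 0, 1, 5, 6])
```

For this genome the mapping yields the rule `O(t-45)>L(t-67)`. A genome such as `[0]` keeps selecting the recursive `<binop>` production and is rejected once the wrap limit is reached.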


Fig. 26.1 Average out-of-sample Sharpe ratio of trading systems produced under the assumption of PUF, NEUF, QUF, and Sharpe ratio fitness functions

The parameters of the power utility function (PUF), $\lambda$ and $\gamma$, are sampled within the ranges $1 < \lambda < 2$ and $1 < \gamma < 35$, where $\lambda$ controls the degree of loss aversion and $\gamma$ controls the degree of risk aversion. The parameters of the negative exponential utility function (NEUF), $a$, $b$, and $c$, were sampled from the ranges $1 < a < 35$, $1 < b < 35$, and $1 < c < 35$. The parameter of the quadratic utility function (QUF) was sampled from the range $1 < b < 35$.

26.4 Data

In this chapter, we use high frequency tick data for Amvesco for the period between 1 March 2007 and 1 April 2007. The data was compressed into a series of five-minute open, high, low, and close prices. The data was then divided into four blocks for k-fold cross validation [11].
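The compression of ticks into five-minute bars and the split into four blocks might look as follows. This is a sketch under stated assumptions: ticks are taken to be `(timestamp_in_seconds, price)` pairs sorted by time, bars are aligned to multiples of the bar width, and the blocks are contiguous; none of these details is given in the chapter.

```python
def to_bars(ticks, width=300):
    """Aggregate (timestamp, price) ticks into (open, high, low, close) bars."""
    bars = {}
    for timestamp, price in ticks:
        key = int(timestamp // width)  # index of the bar this tick falls into
        if key not in bars:
            bars[key] = [price, price, price, price]
        else:
            bar = bars[key]
            bar[1] = max(bar[1], price)  # high
            bar[2] = min(bar[2], price)  # low
            bar[3] = price               # close: last tick seen in the bar
    return [tuple(bars[key]) for key in sorted(bars)]


def split_blocks(series, k=4):
    """Split a series into k contiguous blocks of nearly equal size."""
    n = len(series)
    return [series[n * i // k: n * (i + 1) // k] for i in range(k)]


# Hypothetical ticks spanning three five-minute intervals.
sample_bars = to_bars([(0, 10.0), (100, 12.0), (299, 9.0), (300, 11.0), (650, 8.0)])
sample_blocks = split_blocks(list(range(10)), k=4)
```

Intervals without any tick simply produce no bar in this sketch; a production pipeline would have to decide whether to forward-fill them.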

26.5 Results and Discussion

In this section, we present the results obtained from producing trading algorithms with the power utility function (PUF), negative exponential utility function (NEUF), quadratic utility function (QUF), and Sharpe ratio (SR) as objective functions. Utility is not comparable across different utility functions; hence, the analysis is performed directly on the returns obtained by the trading algorithms. We average the performance of the trading algorithms across the different blocks in accordance with k-fold cross validation. Furthermore, the trading algorithms developed are compared to the turtle trading system (TTS) (see Sect. 26.2.1). This comparison serves as a test of the hypothesis that trading systems developed using our framework are able to outperform the turtle trading system. In addition, the trading algorithms developed are compared to a set of randomly initialised trading algorithms (MC). The randomly initialised trading algorithms were mapped using the grammar used to coevolve our trading algorithms. This comparison tests whether the performance of the trading systems can be reproduced by chance.

Figure 26.1 depicts the cumulative distribution function of the average Sharpe ratios of trading algorithms produced under the assumption of PUF, NEUF, QUF, and Sharpe ratio as fitness functions. Figure 26.1 suggests that, given the assumption of no budget constraints and frictionless markets, trading systems produced under the assumption of PUF and SR have a better reward-to-risk ratio (Sharpe ratio) for the data set considered. A Kruskal-Wallis ANOVA test of the null hypothesis that the Sharpe ratios of agents produced under the assumption of PUF, NEUF, QUF, and SR as fitness functions are the same was performed, and the test results are shown in Table 26.4.

Table 26.4 Kruskal-Wallis ANOVA test results for out-of-sample average Sharpe ratios of agents produced under the assumption of PUF, NEUF, QUF, and Sharpe ratio as fitness functions

χ²        p-value
85.640    0.000

The results in Table 26.4 show that trading systems produced under the assumption of PUF, NEUF, QUF, and SR produce Sharpe ratios that are statistically different from each other. The results in Table 26.4 support the results in Fig. 26.1. Traditional investment theory postulates that, provided investors have utility functions that satisfy the assumptions of risk aversion and non-satiation, the mean and standard deviation of the outcomes of their investments are enough to summarise the distribution of outcomes, irrespective of their utility functions. All the utility functions employed in this chapter satisfy the assumptions of risk aversion and non-satiation. The results in Fig. 26.1, however, show that there is a difference between trading systems developed using different utility functions and trading systems developed using the Sharpe ratio.

Table 26.5 shows results from the Kruskal-Wallis ANOVA test of the null hypothesis that the Sharpe ratios of the agents produced under the assumption of PUF, NEUF, QUF, and SR as objective function are not different from those of a set of random strategies (MC).

Table 26.5 Kruskal-Wallis ANOVA test results for the null hypothesis that the out-of-sample Sharpe ratios of agents produced under the assumption of PUF, NEUF, QUF, and Sharpe ratio as fitness functions are the same as those of a set of random strategies (MC)

Objective function   χ²        p-value
PUF                  19.960    0.000
NEUF                 1.400     0.237
QUF                  3.110     0.078
SR                   8.340     0.040

The results in Table 26.5 suggest that trading systems produced under the assumption of PUF, QUF, and SR produce Sharpe ratios that are significantly different from those of a set of randomly initialised strategies. This implies that the performance of these trading systems is highly unlikely to have resulted from pure chance.

To test the hypothesis that our framework can be used to produce trading algorithms that outperform the turtle trading system, the performance of the trading systems developed is compared to the performance of the turtle trading system using a sign test. Table 26.6 contains results from a sign test of the null hypothesis that the average out-of-sample Sharpe ratios of the trading systems developed under the assumption of PUF, NEUF, QUF, and SR as objective function are not different from that of the turtle trading system (TTS).

Table 26.6 Sign test results for the null hypothesis that the out-of-sample Sharpe ratios of trading systems produced under the assumption of PUF, NEUF, QUF, and SR as fitness functions come from a continuous distribution with a median that is the same as the Sharpe ratio of the TTS

Objective function   z-value   sign   p-value
PUF                  1.839     18     0.000
NEUF                 -6.418    1      0.000
QUF                  -4.000    10     0.000
SR                   -6.647    1      0.000

The results in Table 26.6 suggest that trading systems produced under the assumption of PUF as objective function produced Sharpe ratios that are significantly better than the Sharpe ratio of the TTS.

26.6 Chapter Summary

Advancements in communication and computer technology have allowed trading systems to be implemented as computer programs that execute orders, and this approach has gained a lot of popularity [2]. In this chapter, we introduced a method based on GE for coevolving technical trading rules for high frequency trading (see Sect. 26.3 for the method). Our results suggest that our framework is capable of producing trading algorithms that outperform the turtle trading system under no budget constraint when using the power utility function as objective function. The results in this chapter show that there is a significant difference between the performance of trading systems produced under the assumption of PUF, NEUF, QUF, and Sharpe ratio as objective function. This suggests that the coevolutionary approach is highly sensitive to the objective function chosen.

References

1. Faith C (2003) The original turtle trading rules. http://www.originalturtle.org
2. Hendershott T, Jones CM, Menkveld AJ (2011) Does algorithmic trading improve liquidity? J Finance 66(1):1–33
3. O’Neill M, Brabazon A, Ryan C, Collins JJ (2001) Evolving market index trading rules using grammatical evolution. Appl Evolut Comput, Lect Notes Comput Sci 2037(2001):343–352
4. Anderson JA (2003) Taking a peek inside the turtle’s shell. School of Economics and Finance, Queensland University of Technology, Australia
5. Adamu K, Phelps S (2010) Coevolution of technical trading rules for high frequency trading. Lecture notes in computer science and engineering, proceedings of the world congress on engineering, WCE 2010(1):96–101
6. Saks P, Maringer D (2009) Evolutionary money management. Lecture notes in computer science, Applications of evolutionary computing, vol 5484(2009). Springer, Heidelberg, pp 162–171
7. Cuthbertson K, Nitzsche D (2004) Quantitative financial economics, 2nd edn. Wiley, Chichester, pp 13–32 (Chapter 1)
8. Amman H, Rusten B (eds) (2005) Portfolio management with heuristic optimization. Advances in computational management sciences, vol 8. Springer, Berlin, pp 1–37 (Chapter 1)
9. Eiben AE, Smith JE (2003) Introduction to evolutionary computing. Springer, Berlin (Chapter 9)
10. Wiegand RP, Liles WC, De Jong KA (2001) An empirical analysis of collaboration methods in cooperative coevolutionary algorithms. In: Proceedings of the genetic and evolutionary computation conference, Morgan Kaufmann Publishers
11. Kohavi R (1995) A study of cross-validation and bootstrap for accuracy estimation and model selection. Int Joint Conf Artif Intell 14(2):1137–1145

Chapter 27

High Performance Computing Applied to the False Nearest Neighbors Method: Box-Assisted and kd-Tree Approaches

Julio J. Águila, Ismael Marín, Enrique Arias, María del Mar Artigao and Juan J. Miralles

Abstract In different fields of science and engineering (medicine, economics, oceanography, biological systems, etc.) the false nearest neighbors (FNN) method has a special relevance. In some of these applications, it is important to provide the results in a reasonable time scale, thus the execution time of the FNN method has to be reduced. To achieve this goal, a multidisciplinary group formed by computer scientists and physicists is working collaboratively on developing High Performance Computing implementations of two of the most popular algorithms that implement the FNN method: one based on the box-assisted algorithm and one based on the kd-tree data structure. In this paper, a comparative study of the distributed memory architecture implementations carried out in the framework of this collaboration is presented. As a result, two parallel implementations for the box-assisted algorithm and one parallel implementation for the kd-tree structure are compared in terms of execution time, speed-up and efficiency. In terms of execution time, the approaches presented here are from 2 to 16 times faster than the sequential implementation, and the kd-tree approach is from 3 to 7 times faster than the box-assisted approaches.

J. J. Águila, E. Arias
Albacete Research Institute of Informatics, University of Castilla-La Mancha, Avda. España s/n, 02071 Albacete, Spain

I. Marín, M. del Mar Artigao, J. J. Miralles
Applied Physics Department, University of Castilla-La Mancha, Avda. España s/n, 02071 Albacete, Spain

J. J. Águila
Depto. Ingeniería en Computación, Universidad de Magallanes, Avda. Bulnes, 01855 Punta Arenas, Chile

S. I. Ao and L. Gelman (eds.), Electrical Engineering and Applied Computing, Lecture Notes in Electrical Engineering, 90, DOI: 10.1007/978-94-007-1192-1_27, Springer Science+Business Media B.V. 2011

27.1 Introduction

In nonlinear time series analysis the false nearest neighbors (FNN) method is crucial to the success of the subsequent analysis. Many fields of science and engineering use the results obtained with this method. But the complexity and size of the time series increase day by day, and it is important to provide the results in a reasonable time scale. For example, in the case of electrocardiogram (ECG) studies, this method has to achieve real-time performance in order to allow preventive action to be taken. With the development of parallel computing, large amounts of processing power and memory capacity are available to close the gap between size and time.

The FNN method was introduced by Kennel et al. [1]. Let $X = \{x(i) : 0 \le i < n\}$ be a time series. We can construct points (delay vectors) according to

$$y(i) = [x(i),\, x(i+\tau),\, \ldots,\, x(i+(d-1)\tau)] \tag{27.1}$$

where $\tau$ is the embedding delay and $d$ is the embedding dimension [2]. The Takens embedding theorem [3] states that for a large enough embedding dimension $d \ge m_0$, the delay vectors yield a phase space that has exactly the same properties as the one formed by the original variables of the system. The FNN method is a tool for determining the minimal embedding dimension $m_0$. Working in any dimension larger than the minimum leads to excessive computation when investigating any subsequent question (Lyapunov exponents, prediction, etc.). The method identifies the nearest neighbor $y(j)$ for each point $y(i)$. According to Eq. 27.2, if the normalized distance is larger than a given threshold $R_{tr}$, then the point $y(i)$ is marked as having a false nearest neighbor.

$$\frac{|x(i+d\tau) - x(j+d\tau)|}{\lVert y(i) - y(j) \rVert} > R_{tr} \tag{27.2}$$

Equation 27.2 has to be evaluated for the whole time series and for several dimensions $d = \{1, 2, \ldots, m\}$ until the fraction of points marked as having a false nearest neighbor is zero, or at least sufficiently small (in practice, lower than 1%).

The larger the value of $n$ (the length of the time series), the more computationally expensive it is to find the nearest neighbor of each point. A review of methods to find nearest neighbors, which are particularly useful for the study of time series data, can be found in [4]. We focus on two approaches: one based on the box-assisted algorithm, optimized in the context of time series analysis by [5]; and one based on a kd-tree data structure [6, 7] developed in the context of computational geometry. According to Schreiber, for time series that have a low embedding dimension (e.g. up to about 10), the box-assisted algorithm is particularly efficient. This algorithm can offer a complexity as low as $O(n)$ under certain conditions. On the other hand, according to the literature, if the embedding dimension is moderate, an effective method for nearest neighbor searching consists in using a kd-tree data structure [6, 7]. From the point of view of computational theory, the kd-tree-based algorithm has the advantage of providing an asymptotic number of operations proportional to $O(n \log n)$ for a set of $n$ points, which is the best possible performance for an arbitrary distribution of elements.

We have applied the paradigm of parallel computing to implement three approaches directed towards distributed memory architectures, in order to make a comparative study between the method based on the box-assisted algorithm and the method based on the kd-tree data structure. The results are presented in terms of performance metrics for parallel systems, that is, execution time, speed-up and efficiency. Two case studies have been considered to carry out this comparative study: a theoretical case study, which consists of a Lorenz model, and a real case study, which consists of a time series belonging to electrocardiography.

The paper is organized as follows. After this introduction, a description of the considered approaches is introduced in Sect. 27.2. In Sect. 27.3, the experimental results are presented. Finally, in Sect. 27.4 some conclusions and future work are outlined.
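The delay-vector construction of Eq. 27.1 and the false-neighbor test of Eq. 27.2 from the introduction can be illustrated with a brute-force sketch. It is not the TISEAN `false_nearest` program or the `fnn` program discussed later: the function names, the Euclidean norm, the default threshold, and the quadratic neighbor search are all assumptions chosen for brevity.

```python
import math


def delay_vectors(x, d, tau):
    """Build the delay vectors y(i) of Eq. 27.1 for dimension d and delay tau."""
    count = len(x) - (d - 1) * tau
    return [[x[i + k * tau] for k in range(d)] for i in range(count)]


def fnn_fraction(x, d, tau, r_tr=10.0):
    """Fraction of points whose nearest neighbor is false in dimension d.

    Brute-force O(n^2) search; assumes the Euclidean norm in Eq. 27.2.
    """
    usable = len(x) - d * tau  # x[i + d * tau] must exist for the test
    y = delay_vectors(x, d, tau)[:usable]
    false_count = 0
    tested = 0
    for i in range(usable):
        best_j, best_dist = None, math.inf
        for j in range(usable):
            if j != i:
                dist = math.dist(y[i], y[j])
                if dist < best_dist:
                    best_j, best_dist = j, dist
        if best_j is None or best_dist == 0.0:
            continue  # identical points: the ratio of Eq. 27.2 is undefined
        tested += 1
        extra = abs(x[i + d * tau] - x[best_j + d * tau])
        if extra / best_dist > r_tr:
            false_count += 1
    return false_count / tested if tested else 0.0


# A straight line unfolds already in one dimension: no false neighbors.
line_fraction = fnn_fraction([float(i) for i in range(20)], d=1, tau=1)
```

The inner loop over all `j` is exactly the cost that the box-assisted and kd-tree searches are designed to avoid.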

27.2 Parallel Approaches

We selected two programs to start this work: the false_nearest program based on the box-assisted algorithm [8, 9], and the fnn program based on a kd-tree data structure [10]. We employ the Single-Program, Multiple-Data (SPMD) paradigm [11] to design the three parallel approaches. A coarse-grained decomposition [12] has been considered, i.e. we have a small number of tasks in parallel with a large amount of computation. The approaches are directed towards distributed memory architectures using the Message Passing Interface (MPI) standard [13] for communication. Two approaches are based on the box-assisted algorithm and the other approach is based on the kd-tree data structure.


27.2.1 Approaches Based on Box-Assisted Algorithm

The box-assisted algorithm [5] considers a set of $n$ points $y(i)$ in $k$ dimensions. The idea of the method is as follows. Divide the phase space into a grid of boxes of side length $\epsilon$. Each point $y(i)$ lies in one of these boxes. Its nearest neighbors are located in the same box or in one of the adjacent boxes. The false_nearest program is a sequential implementation of the FNN method based on this algorithm.

By profiling the false_nearest program in order to carry out the parallel approaches, four tasks were identified. Let $X$ be a time series, $Y$ a set of points constructed according to Eq. 27.1, BOX an array that implements the grid of boxes (or mesh), and $p$ the number of processes. Two parallel implementations were formed based on these four tasks:

Domain decomposition. Time series $X$ is distributed to the processes. Two ways of distribution have been developed: Time Series (TS) and Mesh (M). In a TS data distribution the time series is split into $p$ uniform parts of length $n/p$, $n$ being the length of the time series. In an M data distribution, each process computes the points that lie in its range of rows. The range of mesh rows is assigned by $s/p$, where $s$ is the size of the BOX.

Grid construction. The BOX array is filled. Two ways of grid construction have been developed: S (Sequential) and P (Parallel). In an S construction each process fills the BOX sequentially, thus each one has a copy. In a P construction each process fills the part of the group of boxes located over a set of assigned mesh rows.

Nearest neighbors search. Each process solves its subproblems according to the chosen domain decomposition. In a TS data distribution each process uses the same group of points $Y$. In an M data distribution each process can use different groups of points.

Communication of results. Processes use MPI to synchronize the grid construction and to communicate the partial results at the end of each dimension.

The approaches are named according to the following nomenclature: DM-P-M denotes a Distributed Memory implementation in which the grid construction is Parallel and the time series is distributed according to the Mesh; DM-S-TS denotes a Distributed Memory implementation in which the grid construction is Sequential and the Time Series is uniformly distributed to the processes.

We have introduced MPI functions into the source codes to obtain programs that can be run on a distributed memory platform. The most important MPI functions used in these programs are as follows:

• MPI_Reduce: combines values provided by a group of MPI processes and returns the combined value in the MASTER process.
• MPI_Allreduce: same as MPI_Reduce, except that the result appears in all the MPI processes.
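The grid idea behind the box-assisted search described above can be sketched for two-dimensional points. This is a simplified illustration, not the false_nearest code: it uses a dictionary keyed by box coordinates instead of a fixed-size BOX array, the maximum norm as distance, and hypothetical function names.

```python
import math
from collections import defaultdict


def build_grid(points, eps):
    """Assign each 2-D point index to the box of side eps that contains it."""
    grid = defaultdict(list)
    for index, (px, py) in enumerate(points):
        grid[(math.floor(px / eps), math.floor(py / eps))].append(index)
    return grid


def neighbours_within(points, grid, eps, i):
    """Indices j != i closer than eps (max norm) to point i.

    Only the box of point i and its eight adjacent boxes are scanned.
    """
    px, py = points[i]
    bx, by = math.floor(px / eps), math.floor(py / eps)
    found = []
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            for j in grid.get((bx + dx, by + dy), ()):
                if j == i:
                    continue
                if max(abs(points[j][0] - px), abs(points[j][1] - py)) < eps:
                    found.append(j)
    return sorted(found)


# Hypothetical points: three close together and one far away.
pts = [(0.10, 0.10), (0.15, 0.12), (0.90, 0.90), (0.26, 0.10)]
grid = build_grid(pts, eps=0.25)
close_to_first = neighbours_within(pts, grid, 0.25, 0)
```

Because any point closer than `eps` must lie in the same or an adjacent box, the search touches only nine boxes regardless of how many points the series contains.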


Let $p$ be the total number of MPI processes. Each process has an identifier $q \in \{0, 1, \ldots, p-1\}$. The process $q = 0$ is treated as MASTER and processes with $q \neq 0$ are treated as slaves. The next algorithm depicts the algorithmic notation for the DM-P-M approach:

The next algorithm depicts the algorithmic notation for the DM-S-TS approach:


27.2.2 Approach Based on the kd-Tree Data Structure

A kd-tree data structure [6, 7] considers a set of $n$ points $y(i)$ in $k$ dimensions. This tree is a $k$-dimensional binary search tree that represents a set of points in a $k$-dimensional space. The variant described by Friedman et al. distinguishes between two kinds of nodes: internal nodes partition the space by a cut plane defined by a value in one of the $k$ dimensions (the one with maximum spread), and external nodes (or buckets) store the points in the resulting hyperrectangles of the partition. The root of the tree represents the entire $k$-dimensional space. The fnn program is a sequential implementation of the FNN method based on this structure.
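A minimal kd-tree with a nearest-neighbor query is sketched below to make the structure concrete. It is a simplification and not the fnn program: points are stored in every node rather than in buckets, the cut dimension cycles with depth instead of following the maximum spread, and all names are illustrative.

```python
import math


def build_kdtree(points, depth=0):
    """Recursively build a kd-tree; each node stores the median point."""
    if not points:
        return None
    axis = depth % len(points[0])  # assumption: cycle through the dimensions
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2
    return {
        "point": ordered[mid],
        "axis": axis,
        "left": build_kdtree(ordered[:mid], depth + 1),
        "right": build_kdtree(ordered[mid + 1:], depth + 1),
    }


def nearest(node, target, best=None):
    """Return (distance, point) of the stored point closest to target."""
    if node is None:
        return best
    dist = math.dist(node["point"], target)
    if best is None or dist < best[0]:
        best = (dist, node["point"])
    diff = target[node["axis"]] - node["point"][node["axis"]]
    near, far = ((node["left"], node["right"]) if diff < 0
                 else (node["right"], node["left"]))
    best = nearest(near, target, best)
    # Visit the far side only if the cut plane is closer than the best match.
    if abs(diff) < best[0]:
        best = nearest(far, target, best)
    return best


# Hypothetical 2-D point set.
cloud = [(2.0, 3.0), (5.0, 4.0), (9.0, 6.0), (4.0, 7.0), (8.0, 1.0), (7.0, 2.0)]
tree = build_kdtree(cloud)
closest = nearest(tree, (9.0, 2.0))[1]
```

The pruning test on the cut plane is what lets the query skip whole subtrees, which is the source of the logarithmic search cost mentioned in the introduction.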


The fnn program has also been analyzed by means of a profiling tool before making the parallel implementation, identifying five main tasks. Thus, let $X$ be a time series, $n$ the length of the time series, $Y$ a set of points constructed according to Eq. 27.1, KDTREE a data structure that implements the kd-tree, $p$ the number of processes, and $q \in \{0, 1, \ldots, p-1\}$ a process identifier. For convenience we assume that $p$ is a power of two. The parallel implementation called KD-TREE-P was formed based on these five tasks:

Global kd-tree building. The first $\log p$ levels of KDTREE are built. All processors perform the same task, thus each one has a copy of the global tree. The restriction $n \ge p^2$ is imposed to ensure that the first $\log p$ levels of the tree correspond to nonterminal nodes instead of buckets.

Local kd-tree building. The local KDTREE is built. At level $\log p$ of the global tree there are $p$ nonterminal nodes. Each processor $q$ builds a local kd-tree using the $(q+1)$th node as root. The first $\log p$ levels are destroyed and KDTREE is pointed to the local tree.

Domain decomposition. Time series $X$ is distributed to the processes. The building strategy imposes a distribution over the time series. Thus, the time series is split according to the kd-tree algorithm and the expected number of items contained in each local tree is approximately $n/p$.

Nearest neighbors search. Each process solves its subproblems: it searches the nearest neighbors for all points in $Y$ that are in the local KDTREE.

Communication of results. Processes use MPI to communicate their partial results at the end of all dimensions. The master process collects all partial results and reduces them.

The next algorithm depicts the algorithmic notation for the KD-TREE-P approach:


27.3 Experimental Results

In order to test the performance of the parallel implementations, we considered two case studies: the Lorenz time series generated by the system of equations described in [14], and the electrocardiogram (ECG) signal generated by the dynamical model introduced in [15]. The Lorenz system is a benchmark problem in nonlinear time series analysis, and the ECG model is used in biomedical science and engineering [16]. The parallel implementations were run on a supercomputer called GALGO, which belongs to the Albacete Research Institute of Informatics [17]. The parallel platform consists of a cluster of 64 machines. Each machine has two Intel Xeon E5450 3.0 GHz processors and 32 GB of RAM. Each processor has 4 cores with 6,144 KB of cache memory. The machines run RedHat Enterprise version 5 and use an Infiniband interconnection network. The cluster is presented as a single resource that is accessed through a front-end node. The results are presented in terms of the performance metrics for parallel systems described in [12]: execution time Tp, speed-up S and efficiency E. These metrics are defined as follows:

• Execution time: The serial runtime of a program is the time elapsed between the beginning and the end of its execution on a sequential computer. The parallel runtime is the time that elapses from the moment a parallel computation starts to the moment the last processing element finishes its execution. We denote the serial runtime by Ts and the parallel runtime by Tp.
• Speed-up is a measure that captures the relative benefit of solving a problem in parallel. It is defined as the ratio of the time taken to solve a problem on a single processing element to the time required to solve the same problem on a parallel computer with p identical processing elements. We denote speed-up by the symbol S.
• Efficiency is a measure of the fraction of time for which a processing element is usefully employed; it is defined as the ratio of speed-up to the number of processing elements. We denote efficiency by the symbol E. Mathematically, it is given by E = S/p.
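The two derived metrics can be computed directly from measured runtimes. The short Python sketch below is only an illustration; the function names and the runtimes in the first call are hypothetical, while the speed-up of 12 on 32 processes is the approximate value the chapter reports for DM-S-TS on the Lorenz series.

```python
def speed_up(t_serial, t_parallel):
    """S = Ts / Tp."""
    return t_serial / t_parallel

def efficiency(s, p):
    """E = S / p, as a fraction between 0 and 1 (above 1 if super-linear)."""
    return s / p

# Hypothetical runtimes: 320 s sequential, 40 s in parallel.
print(speed_up(320.0, 40.0))   # prints 8.0
# Speed-up of about 12 on 32 processes.
print(efficiency(12, 32))      # prints 0.375
```

An efficiency of 0.375 corresponds to the figure of about 37% quoted for 32 CPUs.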


Table 27.1 Size of BOX for each value of p using a Lorenz time series

p     DM-P-M    DM-S-TS
1     8,192     8,192
2     4,096     4,096
4     2,048     4,096
8     2,048     4,096
16    2,048     2,048
32    2,048     2,048

Table 27.2 Size of BOX for each value of p using an ECG time series

p     DM-P-M    DM-S-TS
1     4,096     4,096
2     4,096     4,096
4     4,096     4,096
8     2,048     4,096
16    2,048     2,048
32    2,048     2,048

Let p be the number of processors. The execution time of the approaches was tested for p = {1, 2, 4, 8, 16, 32}, where p = 1 corresponds to the sequential version of the approaches. We used one million records of each time series to calculate the first ten embedding dimensions. Using the mutual information method, we obtained an optimal time delay of τ = 7 for the Lorenz time series and τ = 5 for the ECG signal. In order to obtain the best runtime of the approaches based on the box-assisted algorithm, we found the best size of BOX for each value of p (Tables 27.1 and 27.2). The size of BOX defines the number of rows and columns of the grid of boxes. The values for p = 1 correspond to the sequential version of the program false_nearest. We ran ten tests to obtain the median value of the execution time Tp. In total, 360 tests were performed. The performance metrics are shown in Figs. 27.1 and 27.2. The sequential kd-tree implementation shows a lower execution time than the box-assisted approach, since the grid construction stage of the box-assisted implementation in TISEAN is very expensive in terms of execution time. The behavior of the Lorenz case study and the ECG case study is quite similar. Notice that, according to Figs. 27.1b and 27.2b, the kd-tree implementation shows a super-linear speed-up when p < 8, and this performance decreases when p > 8. The super-linear speed-up is explained by the fact that the cache memory is better exploited and that, when the tree is split, fewer searches have to be done in each subtree. The loss of performance has several causes. The first is that the communication overhead increases. The second, and most important, is that the sequential part of our implementation becomes increasingly significant relative to the parallel part.

Fig. 27.1 Performance metrics for the Lorenz case study, plotted against p (number of MPI processes) for DM-P-M, DM-S-TS and KD-TREE-P: (a) execution time Tp (seconds); (b) speed-up S = Ts / Tp; (c) efficiency E = (S / p) x 100 (%)

Considering only the box-assisted implementations, DM-S-TS is the approach that provides the best results for both the Lorenz attractor and the ECG signal. The reason is that its data distribution is better than that of DM-P-M.

Fig. 27.2 Performance metrics for the ECG case study, plotted against p (number of MPI processes) for DM-P-M, DM-S-TS and KD-TREE-P: (a) execution time Tp (seconds); (b) speed-up S = Ts / Tp; (c) efficiency E = (S / p) x 100 (%)

However, the reconstruction of the mesh is not parallelized in the DM-S-TS implementation, so the sequential part makes the reduction of execution time less significant when more CPUs are used. As the execution time of the find-neighbors stage increases (e.g. for larger time series), this effect becomes much less important. For the Lorenz attractor, the DM-S-TS implementation is around 1.8 times faster than the sequential program when it uses 2 CPUs, and around 12 times faster when it uses 32 CPUs. This means that the efficiency is around 92% for 2 CPUs and decreases to 37% for 32 CPUs. For the ECG signal, the best box-assisted parallel implementation achieves a speed-up of around 16 when it is run on 32 CPUs of GALGO. Moreover, the efficiency is around 93% using 2 CPUs and 51% using 32 CPUs. Unlike the previous case, the efficiency of the best implementation decreases more slowly. An optimized version of TISEAN has been used, which allows the best mesh size to be tuned for each case. With the original TISEAN (fixed mesh size), the reduction of execution time would be more significant. According to the experimental results, the kd-tree-based parallel implementation performs better than the box-assisted parallel implementations, at least in terms of execution time, for both case studies. Because of the large reduction in execution time provided by the kd-tree-based parallel implementation, its performance in terms of speed-up and efficiency appears worse than that of the other approaches.

27.4 Conclusions

In this paper, a comparative study of distributed memory implementations of two different ways to compute the FNN method has been presented: one based on the box-assisted algorithm and one based on the kd-tree data structure. For this comparative study, three different implementations were developed: two based on the box-assisted algorithm and one based on the kd-tree data structure. The most important metric to consider is how well the resulting implementations accelerate the computation of the minimal embedding dimension, which is the ultimate goal of the FNN method. In terms of execution time, the parallel approaches are from 2 to 16 times faster than the sequential implementation, and the kd-tree approach is from 3 to 7 times faster than the box-assisted algorithm. With respect to the experimental results, the kd-tree-based parallel implementation provides the best performance in terms of execution time, reducing it dramatically. As a consequence, its speed-up and efficiency are far from the ideal. However, more case studies of special interest to the authors still need to be addressed: wind speed, ozone, air temperature, etc. Regarding related work, in the context of parallel implementations of the FNN method, the work carried out by the authors could be considered the first. The authors are also working on shared memory implementations using Pthreads [18, 19] or OpenMP [20, 21], and on hybrid MPI+Pthreads or MPI+OpenMP parallel implementations. As future work, the authors are also considering a GPU-based parallel implementation of the algorithms considered in this paper.


To sum up, we hope that our program will be useful in applications of nonlinear techniques to the analysis of real time series as well as artificial time series. This work represents the first step of nonlinear time series analysis, which becomes meaningful when later stages of the analysis, such as prediction, are considered, and when time is a crucial factor for the application.

Acknowledgments This work has been supported by National Projects CGL2007-66440-C04-03 and CGL2008-05688-C02-01/CLI. A short version was presented in [22]. In this version, we have introduced the algorithmic notation for the parallel implementations.

References

1. Kennel MB, Brown R, Abarbanel HDI (1992) Determining embedding dimension for phase space reconstruction using the method of false nearest neighbors. Phys Rev A 45(6):3403–3411
2. Fraser AM, Swinney HL (1986) Independent coordinates for strange attractors from mutual information. Phys Rev A 33(2):1134–1140
3. Takens F (1981) Detecting strange attractors in turbulence. In: Rand DA, Young L-S (eds) Dynamical systems and turbulence, Warwick 1980. Springer, New York, pp 366–381
4. Schreiber T (1995) Efficient neighbor searching in nonlinear time series analysis. Int J Bifurcation Chaos 5:349
5. Grassberger P (1990) An optimized box-assisted algorithm for fractal dimensions. Phys Lett A 148(1–2):63–68
6. Bentley JL (1975) Multidimensional binary search trees used for associative searching. Commun ACM 18(9):509–517
7. Friedman JH, Bentley JL, Finkel RA (1977) An algorithm for finding best matches in logarithmic expected time. ACM Trans Math Software (TOMS) 3(3):209–226
8. Hegger R, Kantz H, Schreiber T (1999) Practical implementation of nonlinear time series methods: the TISEAN package. Chaos 9(2):413–435
9. Hegger R, Kantz H, Schreiber T (2007) Tisean: nonlinear time series analysis. http://www.mpipks-dresden.mpg.de/~tisean
10. Kennel MB (1993) Download page of fnn program. ftp://lyapunov.ucsd.edu/pub/nonlinear/fns.tgz
11. Darema F (2001) The spmd model: past, present and future. In: Lecture notes in computer science, pp 1–1
12. Grama A, Gupta A, Karypis G, Kumar V (2003) Introduction to parallel computing. Addison-Wesley, New York
13. Message Passing Interface. http://www.mcs.anl.gov/research/projects/mpi
14. Lorenz EN (1963) Deterministic nonperiodic flow. J Atmos Sci 20(2):130–141
15. McSharry PE, Clifford GD, Tarassenko L, Smith LA (2003) A dynamical model for generating synthetic electrocardiogram signals. IEEE Trans Biomedical Eng 50(3):289–294
16. ECGSYN (2003) Ecgsyn: a realistic ecg waveform generator. http://www.physionet.org/physiotools/ecgsyn
17. Albacete Research Institute of Informatics. http://www.i3a.uclm.es
18. Mueller F (1999) Pthreads library interface. Institut fur Informatik
19. Wagner T, Towsley D (1995) Getting started with POSIX threads. Department of Computer Science, University of Massachusetts
20. Dagum L (1997) Open MP: a proposed industry standard API for shared memory programming. OpenMP.org
21. Dagum L, Menon R (1998) Open MP: an industry-standard API for shared-memory programming. IEEE Comput Sci Eng 5:46–55
22. Águila JJ, Marín I, Arias E, Artigao MM, Miralles JJ (2010) Distributed memory implementation of the false nearest neighbors method: kd-tree approach versus box-assisted approach. In: Lecture notes in engineering and computer science: proceedings of the World Congress on engineering 2010, WCE 2010, 30 June–2 July, London, UK, pp 493–498

Chapter 28

Ethernet Based Implementation of a Periodic Real Time Distributed System

Sahraoui Zakaria, Labed Abdennour and Serir Aomar

Abstract This work presents the realization of a platform for testing and validating distributed real time systems (DRTS), by following a methodology of development. Our main contribution remains the realization of an industrial communication bus (FIP: Factory Instrumentation Protocol) implemented on an Ethernet platform. It focuses on improving the response time of the bus. For that, we use a deterministic implementation of FIP’s services (variables identification, transmission functions, …) by exploiting the TCP/IP stack. The periodic communications are monitored by real time periodic threads, run on RTAI kernel.

28.1 Introduction

The development of real time and distributed industrial systems often rests on an appropriate methodology. Their implementation may be based on a programming language or on a fast prototyping tool that involves simulators, code generators and hardware in the loop. The design therefore follows a model, an architecture and a language appropriate to the applied methodology. The validation step of such systems, in particular, requires platforms and middleware to distribute the controls, the computation or the data. These platforms ensure the management of the field buses.

S. Zakaria (corresponding author), L. Abdennour, S. Aomar: Computer Science Department, EMP, BP 17 BEB, Algiers, Algeria

S. I. Ao and L. Gelman (eds.), Electrical Engineering and Applied Computing, Lecture Notes in Electrical Engineering, 90, DOI: 10.1007/978-94-007-1192-1_28, Springer Science+Business Media B.V. 2011


Distributed real time applications must satisfy two conditions to communicate data: determinism and reliability. Conventionally, industrial local area networks, or any networks in a hostile environment (such as the engine of a vehicle), use field buses such as the controller area network (CAN) and FIP to meet these two requirements. The FIP field bus is a platform offering a configuration interface that allows a station¹ to take its place in the FIP network. Hence, it can produce and consume periodic or aperiodic variables, send and receive messages with or without connection, or perform the arbitration function of the bus. The arbitrator holds the list of variables, which is created at configuration time. For each variable, the producer and its periodicity of use on the bus are defined [1, 2]. Industrial networks tend to exploit the possibilities of ETHERNET, which has proven to be well adapted since it has a better performance/cost ratio. In the present work we describe a platform that we have developed to test DRTS. We have adopted the methodology proposed by Benaissa and Serir in [3]. It is a development approach based on a top-down hierarchical functional decomposition of a control system. The primitive processes of the system are specified by elementary grafcets that communicate and synchronize using messages. The grafcet processes may refer to components of the same site or of different (distributed) sites. The link between sites is provided by a communication system similar to the FIP field bus, with a deterministic, periodic and synchronous behavior. Our contribution concerns a FIP configuration on ETHERNET. We have mapped the FIP bus architecture (protocols and services) onto its counterpart, the Ethernet communication system together with TCP/IP. The aim is to check that the specification of the resulting FIP bus fits all the mechanisms of FIP: the use of broadcasting in the communication system, the acknowledgment of variable identifiers by the different sites, and the error verification services (physical, link and application layers of OSI). Consequently, the validation rests on the performance of the real time micro kernel RTAI² that we have used. We analyze the TCP/IP/Ethernet model in Sect. 28.2. In Sect. 28.3 we describe the design approach of the proposed FIP field bus. In Sect. 28.4, we implement an example of a distributed real time application specified according to the methodology. Section 28.5 summarizes and discusses the results of the tests. We conclude in Sect. 28.6.

¹ Part of the bus that can be a computer, an automaton, a sensor or an actuator.
² Open source, preemptible, without the latency problems of operating system calls, and with a known jitter.

28.2 Related Works

Data exchange between and within field buses requires high rate communications. However, the majority of field buses, such as WorldFip, FF, Profibus, P-Net and CAN, offer insufficient rates (31.25 kbps to 5 Mbps). To overcome this problem, high rate networks such as FDDI and ATM were proposed, and their capability to support strict time constraints and soft real time has been evaluated. However, these solutions did not meet expectations because of their high cost and complexity of implementation [4]. Some improvements have been made to Ethernet to help it support time-constrained communications: either the Ethernet protocol is modified, or a deterministic layer is implemented over the MAC sublayer. One solution consists in using the TDMA strategy, but it has the drawback of wasting the time slots of idle stations (no transmissions). As examples, we can cite P-CSMA (Prioritized CSMA) and RTNET of the RTAI micro kernel. The PCSMA (Predictable CSMA) technique is data scheduling oriented, where all real time data are assumed to be periodic. Though it avoids wasted time, it incurs an overhead in its off-line scheduling. There are also techniques based on a modification of the binary exponential backoff (BEB) algorithm, like CSMA/DCR (CSMA with deterministic collision resolution), which uses a binary tree search instead of the non-deterministic BEB [5]. Such techniques may support strict as well as soft real time applications by changing the basic structure of Ethernet. Adding a deterministic layer on top of the MAC may lead to the same result. Among these solutions are the Virtual Time Protocol (VTP), the Window Protocol (WP) and traffic smoothing (TS) [4]. Middleware-based communication protocols have recently been proposed for automation applications. They are implemented either on TCP, like Modbus TCP and ProfiNet, or on UDP, such as NDDS (Network Data Delivery Service) [6, 7]. We end the list of technical solutions with Avionics Full DupleX switched Ethernet base 100 TX (AFDX).

28.3 Proposed Approach (FIP over ETHERNET/IP)

The platform design results from an analysis of the Ethernet/TCP/IP model against the corresponding FIP layers.

28.3.1 Analysis of the Physical Layer

The WorldFIP norm defines three transmission rates: 31.25 kbit/s with a bit transmission time of Tbit = 32 µs, 1 Mbit/s with Tbit = 1 µs, and 2.5 Mbit/s with Tbit = 400 ns [1]. By comparison, the Ethernet rate varies between 10 Mbit/s with Tbit = 512 µs and 10 Gbit/s with Tbit = 512 ns [8]. Manchester II coding is used for 10 Mbit/s Ethernet and for WorldFIP. Ethernet 100BaseTx, for instance, uses 4B/5B coding, which limits the Tbit to 8 ms [8]. Note that the effects of noise on the FIP bus are similar to those on the Ethernet categories that use a short Tbit and a low modulation speed.


Fig. 28.1 Computation of the delay for the periodic traffic

Fig. 28.2 Link layer frames’ format of WorldFIP (CENELEC EN61158-2 norm of FIP) [1]

28.3.2 Analysis of the Data Link Layer

The first step consists in analyzing the protocols ARP³ (address resolution protocol) and LLC (logical link control) of TCP/IP, together with the services that may affect the traffic, in order to preserve a deterministic behavior. For our experiments, we used a LAN with one HUB and two PCs on which we installed a Linux system provided with a traffic analyzer (Ethereal). While capturing the Ethernet traffic, we noticed that, in the absence of application traffic, three kinds of queries may occur on the network: two LLC control queries (initialization), two ARP queries with the sender address (generated by default every 180 s), and queries from active services. Furthermore, when we replaced the HUB by a switch, we noticed that the switch also uses the LLC protocol to initialize the network (SSAP query: Spanning tree BPDU⁴ command with a forward delay of 15 ns). To avoid the non-determinism of the protocol, it suffices to set the duration BaseReachable-Time to an unreachable value. Otherwise, the time wasted during a deterministic emission or reception (periodic identification queries, every 180 s) can be bounded by (n + 1) Tra, where n is the ratio between the duration of an emission and the period of an ARP query, and Tra is the transmission time of an ARP query (Fig. 28.1).
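The bound (n + 1) Tra can be evaluated numerically. The Python sketch below is a hedged illustration: taking n as the integer part of the ratio is an assumption of this example, and the emission duration and ARP transmission time used in the call are hypothetical values, not measurements from the chapter. Only the 180 s ARP period comes from the text.

```python
import math

def arp_overhead_bound(emission_s, arp_period_s, t_ra_s):
    """Upper bound (n + 1) * Tra on the time lost to periodic ARP queries.

    n is taken as the integer part of emission duration / ARP period
    (an assumption of this sketch).
    """
    n = math.floor(emission_s / arp_period_s)
    return (n + 1) * t_ra_s

# Hypothetical figures: a 400 s emission and a 100 microsecond ARP query,
# with the default ARP period of 180 s.
bound = arp_overhead_bound(400.0, 180.0, 100e-6)
print(f"{bound * 1e6:.0f} microseconds")  # prints 300 microseconds
```

For an emission shorter than one ARP period, n is 0 and the bound reduces to a single ARP transmission time.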

28.3.3 Benefit of the Data Link Layer

The FIP frame format at the link level (Fig. 28.2) involves a control byte that codes the frame type (ID_DAT, RP_DAT, ID_RQI, RP_ACK …), data bytes (128 bytes or 256 bytes) and two bytes that allow a receiver to check the integrity of the received frame [1]. The source and destination MAC addresses of Ethernet frames [5] (Figs. 28.3 and 28.4) play no role in the specification, since site identification on the FIP bus is of no interest at this level. With the Ethernet CSMA/CD protocol, transmission errors are detected through interference rather than through the absence of an acknowledgment. In the FIP implementation over ETHERNET, timing is ensured implicitly. The frames (identifier, variable response, query response …), however, are processed at the transport and application layers. Error control is provided by a CRC for both FIP and Ethernet, using the same code.

Fig. 28.3 Ethernet II frame

Fig. 28.4 The Ethernet IEEE 802.3 frame

³ ARP: protocol of layer three which makes the correspondence between Internet logical addresses and MAC addresses.
⁴ Used by switches and routers to avoid loops on a WAN.

28.3.4 Benefit of the Network Layer (IP)

In terms of the OSI standard, the TCP/IP protocol offers routing operations, so interconnection between any pair of machines is possible. In a FIP network, however, site identification is provided implicitly by the identifiers of the variables to be transmitted. Recall that the client sites of a FIP system are synchronized only by variable identifiers, and an IP address gives no information about variable identifiers. Consequently, in our design, the sites participating in the exchange of a variable are implicitly identified as producer of this variable via the broadcasting principle. HUBs offer broadcasting implicitly. If switches are used, an appropriate configuration is necessary (ports configured in promiscuous mode).

28.3.5 Benefit of the UDP Layer

The UDP protocol has been the solution of choice for many tools used to implement real time applications. UDP datagrams are well suited to sending the fragments of data generated by such applications. UDP is selected for the speed of communication between its clients. It uses a simplified header structure, restricted to the fields shown in Fig. 28.5. The checksum of the header is computed as for IP packets.

Fig. 28.5 The UDP datagram (message) format

Bits 0-15: Source Port    Bits 16-31: Destination Port
Bits 0-15: Length         Bits 16-31: Checksum
Data
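The four header fields of Fig. 28.5 are 16 bits each, so the header occupies 8 bytes in front of the data. The following Python sketch packs and unpacks such a datagram with the standard struct module; the port numbers and payload are hypothetical, and the checksum is simply left at 0 rather than computed.

```python
import struct

# Source port, destination port, length, checksum: four 16-bit fields
# in network byte order.
UDP_HEADER = struct.Struct("!HHHH")

def build_datagram(src_port, dst_port, payload, checksum=0):
    """Prefix payload with a UDP header; length covers header plus data."""
    length = UDP_HEADER.size + len(payload)
    return UDP_HEADER.pack(src_port, dst_port, length, checksum) + payload

def parse_datagram(datagram):
    """Return (src_port, dst_port, length, checksum, payload)."""
    src, dst, length, checksum = UDP_HEADER.unpack(datagram[:UDP_HEADER.size])
    return src, dst, length, checksum, datagram[UDP_HEADER.size:length]

# Hypothetical ports and a two-byte variable value.
datagram = build_datagram(5000, 5001, b"\x01\x02")
print(len(datagram))             # prints 10
print(parse_datagram(datagram))  # prints (5000, 5001, 10, 0, b'\x01\x02')
```

In practice the operating system builds this header when an application calls sendto on a UDP socket; the sketch only makes the field layout explicit.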

Fig. 28.6 Architecture of the adopted transport level (arbitrator table A; consumer table C)

28.4 Adopted Architecture for the Transport Level

The architecture of Fig. 28.6 shows the solution we have chosen among several possible architectures. Using tables P and C to structure producer and consumer variables simplifies the management of variables at the processing step. Different port numbers are used for the arbitrator and for the producer–consumer sites, in order to separate the messages intended to identify variables from those which contain variables. Using a single port, instead of several, at each producer/consumer site allows a synchronous design of the production and consumption functions. A producer site receives the identifiers ID_DAT of the variables to be produced. The consumer detects the arrival of these frames in order to start an internal timer. If this timer expires, the station considers the next frame only if it carries the emitting port of the arbitrator. Recall that message exchange on the FIP bus is done point to point or multipoint on the same segment. Two 24-bit addresses (source and destination) code the number of the segment of the application entity and its address on this segment. Hence, IP addressing may be used to perform these transactions.

28.5 Maximal Transferring Time (Critical Time)

Real time and distributed applications impose temporal constraints on task completion; these constraints directly affect the messages exchanged between tasks located on different processors. In real time applications, tasks may or may not have temporal constraints, and the same holds for the messages exchanged between them. As indicated in Fig. 28.7, the transfer time of a message is composed of several intermediate times, which are summarized in Table 28.1. If we denote by Δt the duration of a transaction, then:

Fig. 28.7 Transmission times (on each side: FIP task, TCP/IP layers, MAC sublayer; between the two sides: medium, with transmission and propagation time)

$$\Delta t = T_{ta} + 2T_{st} + 2T_{sm} + 2T_{p} + 2T_{rm} + 2T_{rt} + 2T_{rp} + T_{tp} \tag{28.1}$$

Table 28.1 Notation used for transmission times

Identifiers transmission time              Notation   Variables transmission time                   Notation
Identifier sending (FIP arbitrator task)   Tta        Variable sending time (FIP production task)   Ttp
Latest Sendto(socket,UDP…)                 Tst        Latest Sendto(socket,UDP…)                    Tst
MAC emission                               Tsm        MAC emission time                             Tsm
Transmission on the medium                 Tp         Transmission on the medium                    Tp
MAC reception                              Trm        MAC reception time                            Trm
Recvfrm(Socket, UDP…) function             Trt        Recvfrm(Socket, UDP…) function                Trt
Acknowledgement (FIP producer task)        Trp        Checking (FIP arbitrator task)                Trp

Given that, in our protocol, the sources of non-determinism affecting the service are all periodic, the maximal time of a periodic transaction is bounded as:

$$\Delta t \le T_{ta} + 2T_{st} + 2T_{sm} + 2T_{p} + 2T_{rm} + 2T_{rt} + 2T_{rp} + T_{tp} + (n + 1)\,T_{ra} \tag{28.2}$$

where (n + 1) Tra is the bound on the time lost to ARP queries introduced in Sect. 28.3.2.
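As a worked illustration of Eqs. (28.1) and (28.2), the Python sketch below sums the component times of Table 28.1 and adds the ARP term. All numerical values are hypothetical and expressed in microseconds; only the structure of the two formulas comes from the text.

```python
def transaction_time(t):
    """Eq. (28.1): Tta and Ttp occur once, every other component twice."""
    twice = ("Tst", "Tsm", "Tp", "Trm", "Trt", "Trp")
    return t["Tta"] + t["Ttp"] + 2 * sum(t[name] for name in twice)

def transaction_bound(t, n, t_ra):
    """Eq. (28.2): transaction time plus the ARP term (n + 1) * Tra."""
    return transaction_time(t) + (n + 1) * t_ra

# Hypothetical component times in microseconds.
times_us = {"Tta": 10, "Ttp": 12, "Tst": 5, "Tsm": 3,
            "Tp": 6, "Trm": 3, "Trt": 5, "Trp": 4}
print(transaction_time(times_us))                # prints 74
print(transaction_bound(times_us, n=0, t_ra=6))  # prints 80
```

The factor of two reflects the two legs of a transaction: the identifier sent by the arbitrator and the variable returned by the producer.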

28.6 Implementation of the Communication System

28.6.1 Hardware and Software of the Platform

Each component of our case study is a PC equipped with a 100BaseT Ethernet network interface controller and is considered as a site. The RTAI system is installed on each site. We used four machines: three are Pentium 4 PCs with a 2.39 GHz CPU; the fourth is a Pentium 2 with a 233 MHz CPU. The first machine plays the role of arbitrator, the second and the fourth the role of producers–consumers (site 1 and site 2). The third has a monitoring role, analyzing traffic with the Ethereal tool.

Fig. 28.8 Modified macrocycle

28.6.2 Our Arbitration Function

Variables' identifiers are scheduled on a macrocycle as follows:

• a task is associated with each identifier;
• each task must be triggered periodically in the macrocycle, at a precise time of the elementary cycle, to broadcast the identifier;
• the arbitrator executes tasks using a preemptive priority-based scheduler. A task is elected by the scheduler only if all its preceding tasks have completed;
• the time remaining after execution of all the tasks in a microcycle is used for aperiodic exchanges;
• the wake-up date of a periodic task i is computed by taking into account the total transfer time of all the previous transactions of the microcycle (Fig. 28.8).
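The last two rules of the list above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the RTAI implementation: the data layout and function names are invented for the example, the 20 ms elementary cycle and the 1.2 ms transfer time are placeholder values, and the priority scheduler itself is not modeled.

```python
def wake_up_dates(macrocycle, cycle_us, transfer_us):
    """Wake-up date of each identifier task, in microseconds.

    macrocycle: list of microcycles, each a list of identifiers in
    broadcast order. A task wakes after all previous transactions of
    its microcycle have been transferred.
    """
    schedule = []
    for k, microcycle in enumerate(macrocycle):
        elapsed = 0
        for ident in microcycle:
            schedule.append((ident, k * cycle_us + elapsed))
            elapsed += transfer_us[ident]
        if elapsed > cycle_us:
            raise ValueError(f"microcycle {k} overruns its elementary cycle")
    return schedule

def aperiodic_window(microcycle, cycle_us, transfer_us):
    """Time left in a microcycle for aperiodic exchanges."""
    return cycle_us - sum(transfer_us[ident] for ident in microcycle)

# Two elementary cycles: m1 in the first, m2 then m3 in the second.
macrocycle = [["m1"], ["m2", "m3"]]
transfer_us = {"m1": 1200, "m2": 1200, "m3": 1200}
print(wake_up_dates(macrocycle, 20_000, transfer_us))
# prints [('m1', 0), ('m2', 20000), ('m3', 21200)]
print(aperiodic_window(macrocycle[1], 20_000, transfer_us))  # prints 17600
```

Using an upper bound on the transfer time for each identifier, rather than an average, keeps the computed wake-up dates safe with respect to the preceding transactions.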

28.6.3 The Producer–Consumer Function

We now specify the production and consumption tasks of a site. The first task to be executed is the production function, because the client site must first wait for a possible identifier of a variable using the recvfrom primitive. The producer task then scans its table to check whether it is concerned by the variable associated with this identifier, in which case it broadcasts the variable. If the site is not the producer, the same task scans its consumption table to check whether it has to receive the variable, on the same port and using the same primitive. In the consumption processing, the task also starts an internal timer so that a frame loss is confirmed when the timer expires, which preserves the global order of the system.
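The decision taken by a site for each broadcast identifier can be sketched without any network code. In the Python illustration below, the tables are plain dictionaries, the function name and return values are invented for the example, and a received value of None stands for the expiry of the internal timer; none of this is taken from the authors' implementation.

```python
def handle_identifier(ident, produced, consumed, received=None):
    """One step of a producer-consumer site for a broadcast identifier.

    produced: table P, identifier -> current local value
    consumed: table C, identifier -> last value received
    received: the variable frame that followed the identifier, or None
              if the internal timer expired first
    """
    if ident in produced:                 # the site is the producer
        return ("produce", produced[ident])
    if ident in consumed:                 # the site is a consumer
        if received is None:
            return ("lost", None)         # timer expired: frame loss
        consumed[ident] = received
        return ("consume", received)
    return ("ignore", None)               # identifier does not concern us

# Hypothetical tables for a site that produces m3 and consumes m1 and m2.
table_p = {"m3": 1}
table_c = {"m1": None, "m2": None}
print(handle_identifier("m3", table_p, table_c))              # ('produce', 1)
print(handle_identifier("m1", table_p, table_c, received=0))  # ('consume', 0)
print(handle_identifier("m2", table_p, table_c))              # ('lost', None)
```

Checking table P before table C mirrors the order described in the text, where the production function runs first.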


28.6.4 Use of Real Time FIFO Mechanism

Technically, it was not possible to compile a new Ethernet network driver over RTAI. The ETHERNET support of the Linux system is therefore used through the mechanism of real time queues (rt_FIFO), which allows ordinary processes and RTAI real time processes to communicate. The arbitrator creates two rt_FIFOs for its services, because the primitive (rtf_get) used to read variables relies on a separate mechanism of an asynchronous nature. This primitive has been placed in a function that we have called monitoring.

28.6.5 Monitoring Function

This function handles the exchange of variables via the real time queues and computes the transaction time. It is automatically triggered by the arrival of a variable in the queue (the Linux process has inserted the variable in the queue). This mechanism is provided by the rtf_create_handler(fifo, monitoring_func) primitive. Hence, unlike the FIP bus variables, the rt_FIFO buffer has no refreshment or promptness problem.

28.6.6 Schedulability

In our implementation we used an arbitrator which involves a set of RTAI periodic tasks, and another function for the producer–consumer site. The latter is executed sequentially and respects the order of a FIP transaction. Because the arbitrator tasks are periodic, it is possible to apply the schedulability criterion of formula (28.3) and to compute the maximal execution times of transactions:

$$\sum_{i=1}^{n} \frac{\Delta T_i}{PT_i} \le 1 \tag{28.3}$$

where ΔT_i is the ith FIP periodic transaction time and PT_i is the period of the ith FIP transaction.
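Criterion (28.3) is a utilization test and is straightforward to check once the transaction times and periods are known. The Python sketch below is a minimal illustration with hypothetical values in milliseconds.

```python
def utilization(transactions):
    """Sum of transaction time / period over all periodic transactions."""
    return sum(dt / pt for dt, pt in transactions)

def is_schedulable(transactions):
    """Criterion (28.3): the total utilization must not exceed 1."""
    return utilization(transactions) <= 1

# Hypothetical (transaction time, period) pairs in milliseconds.
light = [(1.2, 40.0), (1.2, 40.0), (1.2, 40.0)]
heavy = [(15.0, 20.0), (15.0, 40.0)]
print(is_schedulable(light))  # prints True
print(is_schedulable(heavy))  # prints False
```

The second set fails because its utilization is 0.75 + 0.375 = 1.125, which exceeds 1.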

28.7 A Case Study

In this example, the application is composed of two grafcets distributed over the communication system (Fig. 28.9).


Fig. 28.9 Example of distributed grafcets

Fig. 28.10 Arbitrator table

Communication between the two grafcets requires the transmission of input a (m1) and state X21 (m2) from site 2 to site 1, and the transmission of state X20 (m3) from site 1 to site 2, for each period of the macrocycle. Message m1 is transmitted during the first elementary cycle. Messages m2 and m3 are transmitted during the second one, in the order m2, m3. To ensure a coherent arbitration, we have bounded the time of a transaction and, consequently, the time to be added to the periods of variables. To do so, we compute the differences between the clock value read after every sending and the corresponding clock value sent by the monitoring function. The upper bound is then the maximum of the results obtained (Fig. 28.10).

28.7.1 Experimental Results

To compare our solution with the original FIP, we considered two parameters: the transaction times and the duration of the production function. Since the period (in periodic mode) is initialized to 119 timer ticks, or 100,000 ns, a time value expressed in clock ticks is converted to nanoseconds by multiplying it by 840.336.
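The conversion factor follows directly from the two figures given above, as the short Python sketch below shows. The constant and function names are chosen for this example only.

```python
PERIOD_NS = 100_000    # period in nanoseconds, from the text
PERIOD_TICKS = 119     # the same period in timer ticks, from the text
NS_PER_TICK = PERIOD_NS / PERIOD_TICKS

def ticks_to_ns(ticks):
    """Convert a duration in timer ticks to nanoseconds."""
    return ticks * NS_PER_TICK

print(round(NS_PER_TICK, 3))  # prints 840.336
```

A difference between two clock readings, such as those used to bound the transaction time, can be passed to ticks_to_ns in the same way.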


Fig. 28.11 Interpretation result

Table 28.2 Transaction times in milliseconds

           Test 1   Test 2   Test 3   Test 4
Average    0.700    0.322    0.460    0.600
Maximum    1.2      0.5      1.2      1.2
Minimum    0.1      0.1      0.3      0.3

We give below two sets of results (Fig. 28.11), corresponding respectively to a setup with a 10BaseT HUB and to a setup with a 100BaseT switch. Note that we used category 5 UTP cables. We estimated the time used by a producer to produce the response frame of variable mi, together with the associated propagation time. The values sent by the identifier sending task and those sent by the monitoring function are given in clock ticks. The differences between these values are converted into seconds.

28.7.2 Discussions

Table 28.2 gives an idea of some measured times. It is clear that transaction times may be lowered by using a switch. The maximal values are almost equivalent for all the tests and vary between 0.5 and 1.2 ms. The production times are often less than half of the transaction times, which points to the slowness and non-determinism of the emission and reception functions of the Linux arbitrator as the cause of these results.


Table 28.3 Sample of the original FIP scan speed [3]

Variable size (bytes)   Scanned variables
1                       320
2                       304
4                       277
8                       235
16                      181
32                      123
64                      75
128                     42

Fig. 28.12 Comparison of scrutation speeds

For example, take the value 1.2 ms, which is the time of the seventh transaction of test 3, and the value 0.047 ms, which is the time of its production. The delay is clearly due to the emission and reception functions of the arbitrator (Fig. 28.11). Another example concerns the maximal value obtained in test 1, which corresponds to the transaction of variable m3. This value of 0.979 ms gives the production time of the variable by the slowest machine (site 2).

28.7.3 Speed of Variables Scrutation

For a FIP network at 2.5 Mbps, with a reversal time of 10 ms and an elementary cycle of 20 ms, the number of variables that can be scanned is given in Table 28.3. From Table 28.2, we can derive the interval of variation of the arbitrator scan speed (Fig. 28.12). With the switch, the scan speed may reach 200 variables per 20 ms. For the FIP, this speed is only reached when the size of variables decreases from 16 to 8 bytes.
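The scan speed of the implemented arbitrator can be estimated as the number of transactions that fit in one elementary cycle. The Python sketch below assumes this simple division (an assumption of the example) and applies it to the smallest and largest transaction times of Table 28.2, expressed in microseconds to keep the arithmetic exact.

```python
def scanned_per_cycle(cycle_us, transaction_us):
    """Number of whole transactions that fit in one elementary cycle."""
    return cycle_us // transaction_us

CYCLE_US = 20_000  # elementary cycle of 20 ms
print(scanned_per_cycle(CYCLE_US, 100))    # minimum of 0.1 ms: prints 200
print(scanned_per_cycle(CYCLE_US, 1_200))  # maximum of 1.2 ms: prints 16
```

The upper value of 200 variables per 20 ms is the figure quoted in the text for the switch.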

28.7.4 Useful Bit Rate The useful bit rate is the ratio of the effective information to the duration of a transaction. For variables of size L with 1 ≤ L ≤ 16 bytes, a MAC frame always has a size of 64 bytes. If the variables' size L is greater than 16 bytes, the size of the MAC frame varies between 64 and 1500 bytes. Nevertheless, transmitted information is segmented into frames of 1500 bytes if the variables' size exceeds 1453 bytes.

Table 28.4 Comparison between the standard FIP and implemented FIP

Network          Useful data  Transmitted data  Transaction  Useful bit rate  Efficiency
                 (byte)       (byte)            time (ms)    (Kbps)           (%)
FIP              4            19 or (16 + 4)    0.072        444.44           17.77
FIP              8            23                0.084799     754.72           19.27
FIP              16           31                0.013800     1159.42          32.32
FIP              32           47                0.020200     1584.16          48.85
FIP              64           79                0.033000     1939.39          65.64
FIP              128          143               0.058600     2184.30          79.25
Implemented FIP  4            64                0.099999     5120             05.12
Implemented FIP  8            64                0.099999     5120             05.12
Implemented FIP  16           64                0.099999     5120             05.12
Implemented FIP  32           79 or (47 + 32)   0.099999     6320             06.32

28.7.5 Efficiency We now compute the efficiency of our solution as the ratio between the emission time of the effective information and the duration of the transaction. Equivalently, it is the ratio of the useful bit rate to the transmission rate. Table 28.4 compares the efficiency of FIP [2] with that of our implementation. To complete the table, we deduce the FIP transaction time as the ratio of the length of useful information to the useful bit rate. To compute the useful bit rate, we take the global minimum of all the durations (tests of the previous example). The tests on the implemented FIP with a variable of 32 bytes gave the same minimums. This table gives an idea of the margin that we have on the size of data that can be transmitted in a transaction. Hence, efficiency increases if we add other services (aperiodic variable exchange).
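The two ratios defined above can be illustrated with a minimal Python sketch. The function names are hypothetical, and the inputs are taken from the first FIP row of Table 28.4 (4 useful bytes, 0.072 ms) and the 2.5 Mbps FIP line rate quoted earlier; this is an illustration of the arithmetic, not the authors' measurement code.

```python
def useful_bit_rate_kbps(useful_bytes, transaction_ms):
    """Useful bit rate: useful bits divided by transaction time.

    Bits per millisecond are numerically equal to kbit/s.
    """
    return useful_bytes * 8 / transaction_ms


def efficiency_percent(bit_rate_kbps, line_rate_kbps):
    """Efficiency: useful bit rate as a percentage of the line rate."""
    return 100.0 * bit_rate_kbps / line_rate_kbps


# First FIP row of Table 28.4, with the 2.5 Mbps (2500 kbit/s) line rate.
rate = useful_bit_rate_kbps(4, 0.072)
eff = efficiency_percent(rate, 2500.0)
print(round(rate, 2), round(eff, 2))
```

The computed efficiency rounds to 17.78, while the table lists 17.77, which would be consistent with truncation rather than rounding in the source.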

28.8 Conclusions and Future Work We have compared the results obtained for the distributed grafcets and the standard FIP on a practical example. The comparison addressed the scanning speed of the arbitration table and the efficiency of the communication system in its cyclic part. Results obtained using switched Ethernet (switches) show that the arbitrator of the implemented FIP can scan its arbitration table faster than the standard FIP.

The implemented platform supports the validity of the distributed grafcet model. It constitutes in itself a new design for implementing distributed real-time systems, which can be described as distributed systems for field database management.

References
1. WorldFIP tools FIPdesigner hlp technologies (2000) LM2-CNF-2-001-D, 12 Jul
2. WorldFip Protocole (1999) European standard, En 50170. http://www.WorldFIP.org
3. Benaissa M (2004) GRAFCET based formalism for design of distributed real time systems. Master Thesis, EMP Bordj-El-Bahri, Algiers, Algeria (in French)
4. Wang Z, Song YQ, Chen JM, Sun YX (2001) Real time characteristics of Ethernet and its improvement. In: Proceeding of the world congress on intelligent control and automation, June
5. Pujolle G (1997) Networks, 2nd edn. Eyrolles (in French)
6. Venkatramani C, Chiueh T, Supporting real-time traffic on Ethernet. In: 1052-8725/94 IEEE
7. Doléjs O, Hanzalék Z (2003) Simulation of Ethernet for real time applications. In: IEEE, ICIT, Maribor, Slovenia
8. Telecommunication and Networks, Claude Sevin, Dunod 2006 (in French)

Chapter 29

Preliminary Analysis of Flexible Pavement Performance Data Using Linear Mixed Effects Models Hsiang-Wei Ker and Ying-Haur Lee

Abstract Multilevel data are very common in many fields. Because of their hierarchical structure, multilevel data are often analyzed using Linear Mixed-Effects (LME) models. The exploratory analysis, statistical modeling, and examination of model fit for LME models are more complicated than those for standard multiple regression. A systematic modeling approach using visual-graphical techniques and LME models is proposed and demonstrated using the original AASHO road test flexible pavement data. The proposed approach, which includes exploring the growth patterns at both group and individual levels, identifying the important predictors and unusual subjects, choosing suitable statistical models, selecting a preliminary mean structure, selecting a random structure, selecting a residual covariance structure, model reduction, and the examination of the model fit, is further discussed.

H.-W. Ker (&) Department of International Trade, Chihlee Institute of Technology, Taipei, 220, Taiwan e-mail: [email protected] Y.-H. Lee Department of Civil Engineering, Tamkang University, Taipei, 251, Taiwan e-mail: [email protected]

S. I. Ao and L. Gelman (eds.), Electrical Engineering and Applied Computing, Lecture Notes in Electrical Engineering, 90, DOI: 10.1007/978-94-007-1192-1_29, Springer Science+Business Media B.V. 2011

29.1 Introduction Longitudinal data are used in research on growth, development, and change. Such data consist of measurements taken on the same subjects repeatedly over time. Describing the pattern of individual growth, making predictions, and gaining more insight into the underlying causal relationships related to developmental patterns require studying the structure of measurements taken on different occasions [1]. Multivariate analysis of variance (MANOVA), repeated measures ANOVA, and standard multiple regression methods have been the most widely used tools for analyzing longitudinal data. Polynomial functions are usually employed to model individual growth patterns. Classical longitudinal data analysis relies on balanced designs in which each individual is measured at the same time points (i.e., no missing observations).

MANOVA, which imposes no constraints on the residual covariance matrix, is one common approach to analyzing longitudinal data. However, an unconstrained residual covariance structure is not efficient if the residual errors do possess a certain structure, especially since this structure is often of interest in longitudinal studies. Repeated measures ANOVA assumes sphericity. This assumption is too restrictive for longitudinal data because such data often exhibit larger correlations between nearby measurements than between measurements that are far apart, and the variance and covariance of the within-subject errors also vary over time. The sphericity assumption is therefore inappropriate in longitudinal studies if residual errors exhibit heterogeneity and dependence.

In longitudinal studies, the focus is on determining whether subjects respond differently under different treatment conditions or at different time points. The errors in longitudinal data often exhibit heterogeneity and dependence, which call for structured covariance models. Longitudinal data also typically possess a hierarchical structure in which the repeated measurements are nested within an individual: the repeated measures are the first-level units, the individual is the second-level unit, and groups of individuals are higher-level units [2].

Traditional regression analysis and repeated measures ANOVA fail to deal with these two major characteristics of longitudinal data. Linear Mixed-Effects (LME) models are an alternative for analyzing longitudinal data. These models can be applied to data where the number and the spacing of occasions vary across individuals and the number of occasions is large. LME models can also be used for continuous time. LME models are more flexible than MANOVA in that they do not require an equal number of occasions for all individuals or even the same occasions. Moreover, varied covariance structures can be imposed on the residuals based on the nature of the data. Thus, LME models are well suited for longitudinal data that have variable occasion times, unbalanced data structures, and constrained covariance models for residual errors. A systematic modeling approach using visual-graphical techniques and LME models was proposed and demonstrated using the original AASHO road test flexible pavement data [3]. The proposed approach, which includes characterizing the growth patterns at both group and individual levels, identifying the important predictors and unusual subjects, choosing suitable statistical models, selecting random-effects structures, suggesting possible residual covariance models, and examining the model fit, is discussed further in the following sections [4–7].

29.2 Methods Hierarchical linear models allow researchers to analyze hierarchically nested data with two or more levels. A two-level hierarchical linear model consists of two submodels: individual-level (level-1) and group-level (level-2). The parameters in a group-level model specify the unknown distribution of individual-level parameters. The intercept and slopes at individual-level can be specified as random. Substituting the level-2 equations for the slopes into the level-1 model yields a linear mixed-effects (LME) model. LME models are mixed-effects models in which both fixed and random effects occur linearly in the model function [8]. In a typical hierarchical linear model, the individual is the level-1 unit in the hierarchy. An individual has a series of measurements at different time points in longitudinal studies [9]. When modeling longitudinal data, the repeated measurements are the level-1 units (i.e., a separate level below individuals). The individual is the second-level unit, and more levels can be added for possible group structures [2]. The basic model at the lowest level, also regarded as repeated-measures level, for the application of hierarchical linear model in longitudinal data can be formulated as:

Level-1:
\[
Y_{tj} = \beta_{0j} + \beta_{1j} c_{tj} + \beta_{2j} x_{tj} + r_{tj} \tag{29.1}
\]

where $Y_{tj}$ is the measure for an individual j at time t, $c_{tj}$ is the time variable indicating the time of measurement for this individual, $x_{tj}$ is the time-varying covariate, and $r_{tj}$ is the residual error term.

Level-2:
\[
\begin{aligned}
\beta_{0j} &= \gamma_{00} + \gamma_{01} W_{j1} + u_{0j} \\
\beta_{1j} &= \gamma_{10} + \gamma_{11} W_{j1} + u_{1j} \\
\beta_{2j} &= \gamma_{20} + \gamma_{21} W_{j1} + u_{2j}
\end{aligned} \tag{29.2}
\]

In this level-2 equation, W is the time-invariant covariate for this individual. After substituting level-2 equations into level-1, the combined or the linear mixed-effects model is:

\[
Y_{tj} = \left[\gamma_{00} + \gamma_{10} c_{tj} + \gamma_{20} x_{tj} + \gamma_{01} W_{j1} + \gamma_{11} W_{j1} c_{tj} + \gamma_{21} W_{j1} x_{tj}\right] + \left[u_{0j} + u_{1j} c_{tj} + u_{2j} x_{tj} + r_{tj}\right] \tag{29.3}
\]
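The substitution of the level-2 equations into the level-1 model can be checked numerically. The following Python sketch evaluates the two-step form and the combined form for one observation and confirms that they agree. All numeric values and function names are made up for illustration and do not come from the pavement data.

```python
def level1(beta, c, x, r):
    """Level-1 model: Y = b0 + b1*c + b2*x + r."""
    b0, b1, b2 = beta
    return b0 + b1 * c + b2 * x + r


def level2(gamma, W, u):
    """Level-2 model: beta_k = gamma_k0 + gamma_k1*W + u_k for k = 0, 1, 2."""
    return [g0 + g1 * W + uk for (g0, g1), uk in zip(gamma, u)]


def combined(gamma, W, u, c, x, r):
    """Combined model: fixed part plus random part."""
    (g00, g01), (g10, g11), (g20, g21) = gamma
    fixed = g00 + g10 * c + g20 * x + g01 * W + g11 * W * c + g21 * W * x
    random_part = u[0] + u[1] * c + u[2] * x + r
    return fixed + random_part


# Hypothetical coefficients (gamma_k0, gamma_k1) and random effects u_k.
gamma = [(4.0, 0.5), (-0.03, 0.002), (0.1, -0.01)]
u = [0.2, -0.005, 0.01]
W, c, x, r = 3.0, 10.0, 2.0, 0.05

two_step = level1(level2(gamma, W, u), c, x, r)
one_step = combined(gamma, W, u, c, x, r)
print(abs(two_step - one_step) < 1e-12)
```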

The level-1 model is a within-individuals model and the level-2 model is a between-individuals model [10]. Before the variable W is introduced, there is no time-invariant covariate in level-2, and the variances and covariances of the u's are those of the random intercept and slopes. After introducing the variable W, they are the variances and covariances of the residual intercept and slopes after partitioning out the variable W. More time-invariant variables can be added sequentially into level-2 to get different models. The reduction in the variance of the u's provides an estimate of the variance in intercepts and slopes accounted for by those W's [11]. This linear mixed-effects model does not require every individual to have the same number of observations, which may differ because of withdrawal from the study or data transmission errors.

Let $Y_{tj}$ denote the tth measurement on the jth individual, with t = 1, 2, …, $n_j$ measurements for subject j and j = 1, 2, …, N individuals. The vector $Y_j$ is the collection of the observations for the jth individual. A general linear mixed-effects model for individual j in longitudinal analysis can be formulated as:

\[
Y_j = X_j \beta + Z_j U_j + R_j \tag{29.4}
\]

where $X_j$ is an ($n_j \times p$) design matrix for the fixed effects and $\beta$ is a ($p \times 1$) vector of fixed-effect parameters. $Z_j$ is an ($n_j \times r$) design matrix for the random effects, and $U_j$ is an ($r \times 1$) vector of random-effect parameters assumed to be independently distributed across individuals with a normal distribution, $U_j \sim \mathrm{NID}(0, T)$. The $U_j$ vector captures the subject-specific mean effects and reflects the extra variability in the data. $R_j$ is an ($n_j \times 1$) vector of residuals. The within-subject errors, $R_j$, are assumed normally distributed with mean zero and variance $\sigma^2 W_j$, where $W_j$ (which stands for "within") is a covariance matrix with a scale factor $\sigma^2$. The matrix $W_j$ can be parameterized using a few parameters and assumed to have various forms, e.g., an identity matrix or a first-order autoregressive or moving-average process [12, 13]. The errors are independent from individual to individual and are independent of the random effects, $U_j$. Other choices for variance–covariance structures that involve correlated within-subject errors have been proposed. Using appropriate covariance structures can increase efficiency and produce valid standard errors. The choice among covariance structures depends upon data structures, subject-related theories, and available computer packages. In some cases, heterogeneous error variances can be employed in the model, so that the variances are allowed to increase or decrease with time and the assumption of a common variance shared by all individuals is removed [12, 14]. LME models generally assume that level-1 residual errors are uncorrelated over time. This assumption is questionable for longitudinal data that have observations closely spaced in time. There typically exists dependence between adjacent observations. This is called serial correlation, and it tends to diminish as the time between observations increases.
Serial correlation is part of the error structure and if it is present, it must be part of the model for producing proper analysis [12]. If the dependent within-subject errors are permitted, the choice of the model to represent the dependence needs careful consideration. It would be preferable to incorporate as much individual-specific structure as possible before introducing a serial correlation structure into within-subject errors [15].
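Under the assumptions stated for Eq. (29.4), the implied marginal distribution of an individual's response vector can be written out explicitly. The expression below is a standard consequence of those assumptions (normal random effects with covariance $T$, normal within-subject errors with covariance $\sigma^2 W_j$, and independence between the two) rather than a formula quoted from the chapter; the symbol $V_j$ is introduced here only for illustration.

\[
Y_j \sim N\!\left(X_j \beta,\; V_j\right), \qquad V_j = Z_j T Z_j^{\top} + \sigma^2 W_j
\]

The first term of $V_j$ carries the between-subject variability induced by the random effects, and the second term carries the within-subject structure, which is where heteroscedasticity and serial correlation enter.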

29.3 Data Description The AASHO road test was a large-scale highway research project conducted near Ottawa, Illinois from 1958 to 1960, and it has had by far the largest impact on the history of pavement performance analysis. The test consisted of six loops, numbered 1–6. Each loop was a segment of a four-lane divided highway, and centerlines divided the pavements into inner and outer lanes, called lane 1 and lane 2. Pavement designs varied from section to section. All sections had been subjected to almost the same number of axle load applications on any given date. Performance data were collected based on the trend of the pavement serviceability index at 2-week intervals. The last day of each 2-week period was called an "index day." Index days were numbered sequentially from 1 (November 3, 1958) to 55 (November 30, 1960) [3, 7, 16].

Empirical relationships between pavement thickness, load magnitude, axle type, accumulated axle load applications, and performance trends for both flexible and rigid pavements were developed after the completion of the road test. Several combinations of certain rules, mathematical transformations, analyses of variance, graphs, and linear regression techniques were utilized in the modeling process to develop such empirical relationships. A load equivalence factor was then established to convert different configurations of load applications to standard 18-kip equivalent single-axle loads (ESAL). This ESAL concept has been adopted internationally since then. As pavement design evolves from traditional empirically based methods toward mechanistic-empirical methods, the ESAL concept used for traffic load estimation is no longer adopted in the recommended Mechanistic-Empirical Pavement Design Guide (MEPDG) [17], although many researchers have argued that it is urgently in need of reconsideration [3, 18, 19]. During the road test, it was found that the damage rate was relatively low in winter but relatively high in spring for flexible pavements. Therefore, load applications were adjusted by a "seasonal weighting function" so that a better "weighted" flexible pavement equation was developed.

Lee [18] has pointed out that the error variance increases when the predicted number of weighted load repetitions (W) increases. To predict the pavement serviceability index (PSI) after certain load applications on a given section, engineers commonly rearrange the original flexible pavement equation into the following form:

\[
\mathrm{PSI} = 4.2 - 2.7 \times 10^{\left[0.4 + \frac{1094}{(SN+1)^{5.19}}\right]\left[\log(\mathrm{ESAL}) - 9.36\log(SN+1) + 0.2\right]} \tag{29.5}
\]
\[
SN = 0.44 D_1 + 0.14 D_2 + 0.11 D_3
\]

in which the regression statistics are: $R^2$ = 0.212, SEE = 0.622, N = 1083 [18]. Note that PSI ranges from 0 to 5 (0–1 for very poor; 1–2 for poor; 2–3 for fair; 3–4 for good; and 4–5 for very good conditions). $D_1$ is the surface thickness (in.); $D_2$ is the base thickness (in.); $D_3$ is the subbase thickness (in.).
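The rearranged flexible pavement equation can be evaluated directly. The following Python sketch assumes that "log" denotes the base-10 logarithm and that ESAL is given as a raw count of load applications; the function names and the example layer thicknesses are hypothetical. The formula itself places no lower bound on PSI, so very large ESAL values can give results below zero.

```python
import math


def structural_number(d1, d2, d3):
    """SN from surface, base, and subbase thicknesses in inches."""
    return 0.44 * d1 + 0.14 * d2 + 0.11 * d3


def psi(esal, d1, d2, d3):
    """Predicted PSI from the rearranged flexible pavement equation."""
    sn1 = structural_number(d1, d2, d3) + 1.0
    beta = 0.4 + 1094.0 / sn1 ** 5.19
    exponent = beta * (math.log10(esal) - 9.36 * math.log10(sn1) + 0.2)
    return 4.2 - 2.7 * 10.0 ** exponent


# Hypothetical section: 4 in. surface, 6 in. base, 8 in. subbase.
print(structural_number(4, 6, 8))
print(psi(1e5, 4, 6, 8), psi(1e6, 4, 6, 8))
```

For a fixed section, the predicted PSI starts just below 4.2 at low traffic and falls as the number of load applications grows.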

29.4 Exploratory Analysis Exploratory analysis is a technique to visualize the patterns of data. It is detective work of exposing data patterns relative to research interests. Exploratory analysis of longitudinal data can serve to: (a) discover as much of the information regarding raw data as possible rather than simply summarize the data; (b) highlight mean and individual growth patterns which are of potential research interest; and (c) identify longitudinal patterns and unusual subjects. Hence, individual curves should be plotted and carefully examined before any formal curve fitting is carried out. Given the nature of these flexible pavement data, the exploratory analysis includes exploring "growth" patterns and the patterns regarding experimental conditions.

Fig. 29.1 Mean PSI for each subject (loop/lane) versus index day [line plot; x-axis: Index Day (0 to 60); y-axis: Mean PSI (2.5 to 4.0); one line per subject: loop1/lane1, loop2/lane1, loop3/lane2, loop6/lane2, loop2/lane2, loop3/lane1, loop4/lane1, loop5/lane1, loop5/lane2, loop4/lane2, loop6/lane1]

29.4.1 Exploring "Growth" Patterns The first step, and perhaps the best way to get a sense of new data, is to visualize or plot the data. Most longitudinal data analyses address individual growth patterns over time. Thus, the first useful exploratory analysis is to plot the response variable against time, including individual and overall mean profiles. Individual mean profiles, which summarize aspects of the response variable for each individual over time, can be used to examine possible variations among individuals and to identify potential outliers. The overall mean profile summarizes aspects of the response variable over time for all subjects and is helpful in identifying unusual times at which significant differences arise. Figure 29.1 shows the lines connecting the dependent variable (mean PSI) over time for each subject (loop/lane). Most subjects have higher mean PSIs at the beginning of the observation period, and they tend to decrease over time. The spread among the subjects is substantially smaller at the beginning than at the end. In addition, there exist noticeable variations among subjects. The overall mean growth curve indicates that the overall mean PSIs are larger at the beginning and decrease over time, and that the rate of deterioration is higher at the beginning than at the end.
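The individual and overall mean profiles described above can be computed before plotting with a few lines of Python. This is a minimal sketch: the records are hypothetical placeholders, not AASHO measurements, and the tuple layout (subject, index day, PSI) is an assumption made for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (subject, index_day, PSI of one section).
records = [
    ("loop3/lane1", 1, 4.1), ("loop3/lane1", 1, 3.9),
    ("loop3/lane1", 2, 3.6), ("loop3/lane1", 2, 3.4),
    ("loop4/lane2", 1, 4.3), ("loop4/lane2", 2, 4.1),
]


def mean_profiles(rows):
    """Return (individual, overall) mean PSI profiles keyed by index day."""
    by_subject = defaultdict(lambda: defaultdict(list))
    by_day = defaultdict(list)
    for subject, day, value in rows:
        by_subject[subject][day].append(value)
        by_day[day].append(value)
    individual = {
        s: {d: mean(v) for d, v in sorted(days.items())}
        for s, days in by_subject.items()
    }
    overall = {d: mean(v) for d, v in sorted(by_day.items())}
    return individual, overall


individual, overall = mean_profiles(records)
print(individual)
print(overall)
```

Each inner dictionary is one line of a profile plot such as Fig. 29.1, and `overall` is the overall mean curve.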

29.4.2 Exploring the Patterns of Experimental Conditions In addition to time (in terms of index day), different major experimental conditions may be considered. This exploratory analysis is intended to discover the overall and individual patterns of each experimental condition and their interactions on mean PSIs. Subsequently, the patterns of mean PSIs for each subject and the patterns of overall mean PSIs on each experimental condition and their interactions over time are investigated [7]. Generally speaking, the mean PSIs for pavements with thicker surface layers are higher than those for pavements with thinner surface layers.

29.5 Linear-Mixed Effects Modeling Approach The following proposed modeling approach is generally applicable to modeling multilevel longitudinal data with a large number of time points. Model building procedures including the selection of a preliminary mean structure, the selection of a random structure, the selection of a residual covariance structure, model reduction, and the examination of the model fit are subsequently illustrated.

29.5.1 Selecting a Preliminary Mean Structure Covariance structures are used to model variation that cannot be explained by fixed effects, and they depend highly on the mean structure. The first step in model building is therefore to specify the systematic part and remove it so that the remaining variation can be examined. The dataset includes the following explanatory variables: thick, basethk, subasthk, uwtappl, and FT, where thick is the surface thickness (in.), basethk is the base thickness (in.), subasthk is the subbase thickness (in.), uwtappl is the unweighted applications (millions), and FT is the monthly freeze–thaw cycles. A model containing all main effects and all the two-way and three-way interaction terms was investigated first. This model (called model-1) has the form:

\[
\begin{aligned}
\mathrm{PSI}_{ij} ={}& \beta_{0j} + \beta_{1j}(\mathrm{thick})_{ij} + \beta_{2j}(\mathrm{basethk})_{ij} + \beta_{3j}(\mathrm{subasthk})_{ij} + \beta_{4j}(\mathrm{uwtappl})_{ij} + \beta_{5j}(\mathrm{uwtappl})^2_{ij} + \beta_{6j}(\mathrm{FT})_{ij} \\
&+ \text{two-way interaction terms of thick, basethk, subasthk, and uwtappl} \\
&+ \text{three-way interaction terms of thick, basethk, subasthk, and uwtappl} + R_{ij}
\end{aligned} \tag{29.6}
\]

29.5.2 Selecting a Preliminary Random Structure The second step is to select a set of random effects in the covariance structure. An appropriately specified covariance structure is helpful in interpreting the random variation in the data, achieving efficient estimation, and obtaining valid inferences for the parameters in the mean structure of the model. In longitudinal studies, the same subject is repeatedly measured over time, so the data collected are correlated. The within-subject errors are often heteroscedastic (i.e., having unequal variance), correlated, or both.

29.5.2.1 Exploring Preliminary Random-Effects Structure A useful tool to explore the random-effects structure is to remove the mean structure from the data and use ordinary least squares (OLS) residuals to check the need for a linear mixed-effects model and to decide which time-varying covariates should be included in the random structure. A boxplot of residuals by subject was produced from the fit of a single linear regression using the same form as the preliminary level-1 model; this is the case in which the grouping structure of the data hierarchy is ignored. The residuals are not centered around zero, and their magnitudes differ considerably among subjects. This indicates the need for subject effects, which is precisely the motivation for using a linear mixed-effects model. Separate linear regression models were also fitted to each subject to explore the potential linear relationship. To assist in selecting a set of random effects to be included in the covariance model, plots of the mean raw residuals against time and of the variance of the residuals against time are useful. If only a random-intercepts model holds, the residual has the form $e_{ij} = U_{0j} + R_{ij}$, in which $U_{0j}$ is the random effect for the intercept and $R_{ij}$ is the level-1 error. If this plot shows constant variability over time, or the curves are flat, then only a random-intercept model is needed. If a random-intercepts-and-slopes model holds, the residual has the form $e_{ij} = U_{0j} + U_{1j} x_{1ij} + \cdots + U_{qj} x_{qij} + R_{ij}$, where $U_{qj}$ is the random effect for the qth slope. In this case, the plot would show variability that changes over time or some unexplained systematic structure in the model, and one or more random effects, in addition to the random intercept, have to be added.
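The residual check described above can be sketched in Python with a single pooled regression that ignores the grouping, followed by a summary of the residuals per subject. The data are hypothetical and use one predictor only, whereas the chapter's level-1 model has several; per-subject mean residuals far from zero are the signal that subject effects are needed.

```python
from collections import defaultdict
from statistics import mean


def ols_fit(x, y):
    """Ordinary least squares for y = a + b*x with a single predictor."""
    xbar, ybar = mean(x), mean(y)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b = sxy / sxx
    return ybar - b * xbar, b


# Hypothetical data: (subject, time, response).
rows = [("A", 0, 4.0), ("A", 1, 3.8), ("A", 2, 3.6),
        ("B", 0, 3.0), ("B", 1, 2.8), ("B", 2, 2.6)]

# Pooled fit that ignores the subject grouping.
a, b = ols_fit([t for _, t, _ in rows], [y for _, _, y in rows])

# Residuals grouped by subject.
resid = defaultdict(list)
for subject, t, y in rows:
    resid[subject].append(y - (a + b * t))
mean_resid = {s: mean(r) for s, r in resid.items()}
print(a, b, mean_resid)
```

In this toy example both subjects share the same slope but differ in level, so the pooled residuals sit consistently above zero for one subject and below zero for the other, which is the pattern a random intercept would absorb.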

29.5.2.2 Selecting a Variance–Covariance Structure for Random Effects Three possible variance–covariance structures, namely general positive-definite (unstructured), diagonal, and block-diagonal, each based on different assumptions [8], were investigated. The general positive-definite structure is a general covariance matrix parameterized directly in terms of variances and covariances. The diagonal covariance structure is used when the random effects are assumed independent. The block-diagonal matrix is employed when it is assumed that different sets of random effects have different variances. Table 29.1 displays the comparison of these three models. The unstructured model has the smallest absolute value of log-likelihood among them. The likelihood ratio statistic for the unstructured model versus the diagonal model is 160.23 with a p-value less than 0.0001. Thus, the unstructured variance–covariance model will be used hereafter.

Table 29.1 Model comparison using three variance–covariance structures

Model        df  AIC       BIC       logLik    Test    L. ratio  p-Value
(1) Unstr    29  12910.29  13117.74  -6426.14
(2) Diag     22  13056.52  13213.90  -6506.26  1 vs 2  160.234   <0.0001
(3) Bk-diag  21  13060.14  13210.37  -6509.07  2 vs 3  5.621     0.0177

The random effects of the preliminary level-2 model include the intercept, uwtappl, the quadratic term of uwtappl, and FT. The variance–covariance structure is a general positive-definite matrix. Putting the preliminary level-1 and level-2 models together, the preliminary linear mixed-effects model is then:

\[
\begin{aligned}
\mathrm{PSI}_{ij} ={}& \gamma_{00} + \gamma_{10}(\mathrm{thick})_{ij} + \gamma_{20}(\mathrm{basethk})_{ij} + \gamma_{30}(\mathrm{subasthk})_{ij} + \gamma_{40}(\mathrm{uwtappl})_{ij} + \gamma_{50}(\mathrm{uwtappl})^2_{ij} + \gamma_{60}(\mathrm{FT})_{ij} \\
&+ \gamma_{70}(\mathrm{thick} \times \mathrm{basethk})_{ij} + \gamma_{80}(\mathrm{thick} \times \mathrm{subasthk})_{ij} + \gamma_{90}(\mathrm{basethk} \times \mathrm{uwtappl})_{ij} + \gamma_{10,0}(\mathrm{subasthk} \times \mathrm{uwtappl})_{ij} \\
&+ \gamma_{11,0}(\mathrm{basethk} \times \mathrm{subasthk} \times \mathrm{uwtappl})_{ij} + \gamma_{12,0}(\mathrm{thick} \times \mathrm{basethk} \times \mathrm{subasthk} \times \mathrm{uwtappl})_{ij} \\
&+ U_{0j} + U_{4j}(\mathrm{uwtappl})_{ij} + U_{5j}(\mathrm{uwtappl})^2_{ij} + U_{6j}(\mathrm{FT})_{ij} + R_{ij}
\end{aligned} \tag{29.7}
\]
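The likelihood ratio and AIC entries of Table 29.1 can be reproduced from the reported log-likelihoods and degrees of freedom. The Python sketch below uses only the standard library; `chi2_sf` is a hypothetical helper that evaluates the chi-squared upper tail for integer degrees of freedom, and the test is assumed to use the difference in parameter counts as its degrees of freedom. It is a check of the arithmetic, not the software used by the authors.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for chi-squared, integer df >= 1."""
    half = x / 2.0
    if df % 2 == 1:
        q, k = math.erfc(math.sqrt(half)), 1   # closed form for df = 1
    else:
        q, k = math.exp(-half), 2              # closed form for df = 2
    while k < df:                              # step df upward by 2
        q += half ** (k / 2.0) * math.exp(-half) / math.gamma(k / 2.0 + 1.0)
        k += 2
    return q


def lrt(loglik_full, df_full, loglik_reduced, df_reduced):
    """Likelihood ratio statistic and p-value for nested models."""
    stat = 2.0 * (loglik_full - loglik_reduced)
    return stat, chi2_sf(stat, df_full - df_reduced)


def aic(loglik, df):
    """Akaike information criterion: -2*logLik + 2*df."""
    return -2.0 * loglik + 2.0 * df


stat12, p12 = lrt(-6426.14, 29, -6506.26, 22)  # unstructured vs diagonal
stat23, p23 = lrt(-6506.26, 22, -6509.07, 21)  # diagonal vs block-diagonal
print(stat12, p12, stat23, p23, aic(-6426.14, 29))
```

Small differences in the last digit relative to the table are expected because the log-likelihoods are reported to two decimals.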

29.5.3 Selecting a Residual Covariance Structure The absolute value of the log-likelihood for the heteroscedastic model is 6273.29. The need for a heteroscedastic model can be formally checked using the likelihood ratio test [7]. The small p-value indicates that the heteroscedastic model explains the data significantly better than the homoscedastic model. Correlation structures are used to model dependence among the within-subject errors. The autoregressive model of order 1, called AR(1), is the simplest and one of the most useful models [8]. The autocorrelation function (ACF) for AR(1) is particularly simple: it starts at the lag-1 autocorrelation and then declines geometrically. Autocorrelation functions for autoregressive models of order greater than one are typically oscillating or sinusoidal functions and tend to damp out with increasing lag [20].

Thus, AR(1) may be adequate to model the dependency of the within-subject errors. The absolute value of the log-likelihood for this heteroscedastic AR(1) model is 6207.24. The estimated single correlation parameter φ is 0.125. The heteroscedastic model (corresponding to φ = 0) is nested within the heteroscedastic AR(1) model. Likewise, the need for the heteroscedastic AR(1) model can be checked using the likelihood ratio test [7]. The small p-value indicates that the heteroscedastic AR(1) model explains the data significantly better than the heteroscedastic model, suggesting that within-group serial correlation is present in the data.
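The geometric decline of the AR(1) autocorrelation function, and the within-subject correlation matrix it implies for equally spaced observations, can be sketched as follows. The value 0.125 is the estimate reported above; the function names are hypothetical, and the likelihood ratio statistic is computed here from the two reported log-likelihoods rather than quoted from the source.

```python
def ar1_acf(phi, max_lag):
    """AR(1) autocorrelations for lags 0..max_lag: rho(k) = phi**k."""
    return [phi ** k for k in range(max_lag + 1)]


def ar1_corr_matrix(phi, n):
    """n-by-n correlation matrix with entries phi**|i - j|."""
    return [[phi ** abs(i - j) for j in range(n)] for i in range(n)]


acf = ar1_acf(0.125, 3)
corr = ar1_corr_matrix(0.125, 3)

# Likelihood ratio statistic, heteroscedastic AR(1) vs heteroscedastic model.
lr_stat = 2.0 * (-6207.24 - (-6273.29))
print(acf, corr, lr_stat)
```

With φ = 0.125 the correlation is already below 0.02 at lag 2, so the dependence is short-ranged; the likelihood ratio statistic of about 132 on one degree of freedom is consistent with the small p-value reported.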

29.5.4 Model Reduction After specifying the within-subject error structure, the next step is to check whether the random effects can be simplified. It is also desirable to reduce the number of parameters in the fixed effects in order to achieve a parsimonious model that represents the data well. A likelihood ratio test statistic whose sampling distribution is a mixture of two chi-squared distributions is used to test the need for random effects. The p-value is computed as the equally weighted average of the p-values from the two chi-squared distributions. To assess the significance of the terms in the fixed effects, conditional t-tests are used.
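The equally weighted mixture p-value can be sketched in Python as below. The source states only that two chi-squared distributions are mixed with equal weights; the degrees of freedom passed in the example (1 and 2) and the test statistic of 5.0 are assumptions chosen for illustration, and `chi2_sf` is a hypothetical standard-library helper for integer degrees of freedom.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for chi-squared, integer df >= 1."""
    half = x / 2.0
    if df % 2 == 1:
        q, k = math.erfc(math.sqrt(half)), 1
    else:
        q, k = math.exp(-half), 2
    while k < df:
        q += half ** (k / 2.0) * math.exp(-half) / math.gamma(k / 2.0 + 1.0)
        k += 2
    return q


def mixture_p_value(lr_stat, df_a, df_b):
    """Equal-weight mixture of two chi-squared tail probabilities."""
    return 0.5 * chi2_sf(lr_stat, df_a) + 0.5 * chi2_sf(lr_stat, df_b)


p_mix = mixture_p_value(5.0, 1, 2)
print(p_mix)
```

The mixture p-value always lies between the two single-distribution p-values, so it is less conservative than referring the statistic to the distribution with the larger degrees of freedom alone.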

29.5.4.1 Reduction of Random Effects The matrix of known covariates should not include a polynomial effect unless all hierarchically inferior (lower-order) terms are also included [21]. The same rule applies to interaction terms. Hence, significance tests for higher-order random effects should be performed first. The random effects included in the preliminary random-effects structure are: intercept, uwtappl, uwtappl², and FT. The models and the associated maximum log-likelihood values are compared [7]. The small p-value indicates that the preliminary random-effects structure explains the data significantly better than the others. Thus, no reduction of random effects is needed.

29.5.4.2 Reduction of Fixed Effects An adequate and appropriately specified random-effects structure implies efficient model-based inferences for the fixed effects. When considering the reduction of fixed effects, one model is nested within the other and the random-effects structures are the same for the full and the reduced models, so likelihood ratio tests are appropriate for the model comparison. The parameter estimates, estimated standard errors, t-statistics, and p-values for the fixed effects of the heteroscedastic AR(1) model are revisited. The heteroscedastic AR(1) model can be reduced to a more parsimonious model because some parameter estimates are insignificant. The reduction of fixed effects starts by removing the parameters with the largest p-values and other insignificant terms, and by combining parameters that do not differ significantly. These processes are repeated until no important terms have been left out of the model.

Table 29.2 Proposed preliminary LME model

Random effects      Intercept  uwtappl  uwtappl2  FT       Residual
Standard deviation  0.170      1.679    0.765     0.00722  0.448

Fixed effects                    Value    Std. error  DF    t-Value  p-Value
(Intercept)                       2.4969  0.0703      9423   35.51   <0.0001
thick                             0.2629  0.0122      9423   21.48   <0.0001
basethk                           0.0590  0.0066      9423    8.91   <0.0001
subasthk                          0.0386  0.0041      9423    9.37   <0.0001
uwtappl                          -3.6191  0.5254      9423   -6.89   <0.0001
uwtappl2                          1.1524  0.2481      9423    4.65   <0.0001
FT                                0.0148  0.0023      9423    6.39   <0.0001
thick*basethk                    -0.0062  0.0016      9423   -3.81   <0.0001
thick*subasthk                   -0.0082  0.0010      9423   -8.07   <0.0001
basethk*uwtappl                   0.1275  0.0172      9423    7.40   <0.0001
subasthk*uwtappl                  0.1355  0.0181      9423    7.50   <0.0001
thick*basethk*uwtappl            -0.0155  0.0045      9423   -3.43   0.0006
thick*subasthk*uwtappl           -0.0077  0.0036      9423   -2.16   0.0307
basethk*subasthk*uwtappl         -0.0291  0.0029      9423   -9.87   <0.0001
thick*basethk*subasthk*uwtappl    0.0073  0.0006      9423   11.53   <0.0001

Note Model fit: AIC = 12481.77, BIC = 12710.69, logLik = -6208.89. Correlation structure: AR(1); parameter estimate(s): Phi = 0.126. Variance function structure: for different standard deviations per stratum (thick = 2, 1, 3, 4, 5, 6 in.), the parameter estimates are: 1, 1.479, 0.935, 1.199, 0.982, 0.959

29.5.5 Proposed Preliminary LME Model The final proposed preliminary linear mixed-effects model is listed in Table 29.2. The fixed-effects structure of the proposed model contains significant treatment effects for thick, basethk, subasthk, uwtappl, uwtappl², FT, and several two-, three-, and four-way interaction terms. The positive parameter estimates for thick, basethk, and subasthk indicate that higher mean PSI values tend to occur on thicker pavements. The parameter estimate of uwtappl is negative, indicating lower PSI values for higher load applications. Furthermore, the preliminary LME model also indicates that the residual standard deviation for pavements with a surface thickness of 1 in. or 4 in. is about 48% or 20% higher, respectively, than that for pavements with a surface thickness of 2 in. There exists dependency in the within-subject errors: the estimated single correlation parameter for the AR(1) model is 0.126.
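The percentages quoted for the 1 in. and 4 in. strata follow directly from the variance function estimates in the note to Table 29.2. The Python sketch below assumes that these estimates are ratios of each stratum's residual standard deviation to that of the reference stratum (thick = 2 in.) and that the tabulated residual value 0.448 is the reference standard deviation; both readings are assumptions, and the names are hypothetical.

```python
# Residual standard deviation, assumed to apply to the reference stratum.
RESIDUAL_SD = 0.448

# Variance function estimates per surface thickness (in.), from Table 29.2.
RATIO_BY_THICK = {2: 1.0, 1: 1.479, 3: 0.935, 4: 1.199, 5: 0.982, 6: 0.959}


def stratum_sd(thick_in):
    """Residual standard deviation implied for one thickness stratum."""
    return RESIDUAL_SD * RATIO_BY_THICK[thick_in]


def percent_above_reference(thick_in):
    """Percentage by which a stratum's SD exceeds the 2 in. reference."""
    return 100.0 * (RATIO_BY_THICK[thick_in] - 1.0)


for thick in sorted(RATIO_BY_THICK):
    print(thick, round(stratum_sd(thick), 3),
          round(percent_above_reference(thick), 1))
```

Under these assumptions the 1 in. and 4 in. strata come out 47.9% and 19.9% above the reference, matching the rounded figures in the text, while the 3, 5, and 6 in. strata are slightly below it.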

29.5.6 Examination of the Model Fit

The population predictions (fixed), within-group predictions (Subject), and observed values were plotted versus time, by subject, for the proposed preliminary LME model. Population predictions are obtained by setting the random effects to zero, whereas within-group predictions use the estimated random effects [7]. The prediction line of the within-group predictions follows the observed values more closely, indicating that the proposed LME model provides a better explanation of the data.

29.6 Conclusions

This paper proposed a systematic modeling approach that combines visual-graphical techniques with LME models and is generally applicable to multilevel longitudinal data with a large number of time points. The original AASHO road test flexible pavement data were used to illustrate the approach. Exploratory analysis of the data indicated that most subjects (loop/lane) have higher mean PSIs at the beginning of the observation period, and that these tend to decrease over time. The spread among the subjects is substantially smaller at the beginning than at the end. In addition, there are noticeable variations among subjects. A preliminary LME model for PSI prediction was developed. The positive parameter estimates for thick, basethk, and subasthk indicate that higher mean PSI values tend to occur on thicker pavements. The parameter estimate of uwtappl is negative, indicating lower PSI values for higher load applications. The prediction line of the within-group predictions (Subject) follows the observed values more closely than that of the population predictions (fixed), indicating that the proposed LME model provides a better explanation of the data.

References

1. Goldstein H (1979) The design and analysis of longitudinal studies. Academic Press, Inc, New York
2. Hox JJ (2000) Multilevel analysis of grouped and longitudinal data. In: Little TD, Schnabel KU, Baumert J (eds) Modeling longitudinal and multilevel data: practical issues, applied approaches and specific examples. Lawrence Erlbaum Associates, Mahwah, pp 15–32
3. Highway Research Board (1962) The AASHO road test, report 5, pavement research, special report 61E. National Research Council, Washington
4. Ker HW (2002) Application of regression spline in multilevel longitudinal modeling. Doctoral Dissertation, University of Illinois, Urbana
5. Lee YH, Ker HW (2008) Reevaluation and application of the AASHTO mechanistic-empirical pavement design guide, phase I, summary report, NSC96-2211-E-032-036. National Science Council, Taipei City (In Chinese)
6. Lee YH, Ker HW (2009) Reevaluation and application of the AASHTO mechanistic-empirical pavement design guide, phase II, NSC97-2221-E-032-034, summary report. National Science Council, Taipei City (In Chinese)
7. Ker HW, Lee YH (2010) Preliminary analysis of AASHO road test flexible pavement data using linear mixed effects models. In: Lecture notes in engineering and computer science: proceedings of the world congress on engineering 2010, WCE 2010, 30 June–2 July, London, UK, pp 260–266
8. Pinheiro JC, Bates DM (2000) Mixed-effects models in S and S-PLUS. Springer, New York
9. Laird NM, Ware JH (1982) Random effects models for longitudinal data. Biometrics 38:963–974
10. Anderson CJ (2001) Model building. http://www.ed.uiuc.edu/courses/edpsy490ck
11. MacCallum RC, Kim C (2000) Modeling multivariate change. In: Little TD, Schnabel KU, Baumert J (eds) Modeling longitudinal and multilevel data: practical issues, applied approaches and specific examples. Lawrence Erlbaum Associates, NJ, pp 51–68
12. Jones RH (1993) Longitudinal data with serial correlation: a state-space approach. Chapman & Hall, London
13. Vonesh EF, Chinchilli VM (1997) Linear and nonlinear models for the analysis of repeated measurements. Marcel Dekker, Inc, New York
14. Carlin BP, Louis TA (1996) Bayes and empirical Bayes methods for data analysis. Chapman & Hall, London
15. Goldstein H, Healy MJR, Rasbash J (1994) Multilevel time series models with application to repeated measures data. Stat Med 13:1643–1655
16. Huang YH (2004) Pavement analysis and design, 2nd edn. Pearson Education, Inc., Upper Saddle River
17. ARA, Inc (2004) ERES consultants division, guide for mechanistic-empirical design of new and rehabilitated pavement structure. NCHRP 1–37A report. Transportation Research Board, National Research Council, Washington
18. Lee YH (1993) Development of pavement prediction models. Doctoral dissertation, University of Illinois, Urbana
19. Ker HW, Lee YH, Wu PH (2008) Development of fatigue cracking performance prediction models for flexible pavements using LTPP database. J Transp Eng ASCE 134(11):477–482
20. Pindyck RS, Rubinfeld DL (1998) Econometric models and economic forecasts, 4th edn. McGraw-Hill, Inc, New York
21. Morrell CH, Pearson JD, Brant LJ (1997) Linear transformations of linear mixed-effects models. Am Stat 51:338–343

Chapter 30

Chi-Squared, Yule's Q and Likelihood Ratios in Tabular Audiology Data

Muhammad Naveed Anwar, Michael P. Oakes and Ken McGarry

Abstract In this chapter, we have used the chi-squared test and Yule's Q measure to discover associations in tables of patient audiology data. These records are examples of heterogeneous medical records, since they contain audiograms, textual notes and typical relational fields. In our first experiment we used the chi-squared measure to discover associations between the different fields of audiology data, such as patient gender and patient age, with diagnosis and the type of hearing aid worn. In our second experiment we used Yule's Q to discover the strength and direction of the significant associations found by the chi-squared measure. Finally, we examined the likelihood ratio used in Bayesian evidence evaluation. We discuss our findings in the context of producing an audiology decision support system.

M. N. Anwar (corresponding author), M. P. Oakes
Department of Computing, Engineering & Technology, University of Sunderland, Sunderland, UK

K. McGarry
Department of Pharmacy, Health and Well-Being, University of Sunderland, Sunderland, UK

S. I. Ao and L. Gelman (eds.), Electrical Engineering and Applied Computing, Lecture Notes in Electrical Engineering, 90, DOI: 10.1007/978-94-007-1192-1_30, © Springer Science+Business Media B.V. 2011


30.1 Introduction

Association measures can be used to measure the strength of the relationship between variables in medical data. Discovering associations in medical data has an important role in predicting a patient's risk of certain diseases. Early detection of any disease can save time, money and painful procedures [1]. In our work we are looking for significant associations in heterogeneous audiology data, with the ultimate aim of finding factors that influence which patients would most benefit from being fitted with a hearing aid. Support and confidence are measures of the interestingness of associations between variables [2, 3]. They show the usefulness and certainty of discovered associations. Strong associations are not always interesting, because support and confidence do not filter out uninteresting associations [4]. To overcome this problem, a correlation measure is used to augment support and confidence. One of the correlation measures popularly used in the medical domain is chi-squared (χ²). In Sect. 30.2 we describe our database of audiology data. We first use the chi-squared measure to discover significant associations in our data, as described in Sect. 30.3. We then use Yule's Q measure to discover the strength of each of our significant associations, as described in Sect. 30.4. In Sect. 30.5, we briefly describe our findings for the support and confidence of each of the significant associations. In Sect. 30.6, we use Bayesian likelihood ratios to find associations between words in the comments fields and the type of hearing aid fitted. We draw our conclusions in Sect. 30.7.

30.2 Audiology Data

In this study, we have made use of audiology data collected at the hearing aid outpatient clinic at James Cook University Hospital in Middlesbrough, England, UK. The data consist of about 180,000 individual records covering about 23,000 audiology patients. The data in the records are heterogeneous, consisting of the following fields:

1. Audiograms, which are graphs of hearing ability at different frequencies (pitches).
2. Structured data: gender, date of birth, diagnosis and hearing aid type, as stored in a typical database, e.g. |M|, |09-05-1958|, |TINNITUS|, |BE18|.
3. Textual notes: specific observations made about each patient, such as |HEARING TODAY NEAR NORMAL - USE AID ONLY IF NECESSARY|.

In general, these audiology records are representative of all types of medical records because they involve both structured and unstructured data.


30.3 Discovery of Associations with the Chi-Squared Test Tables

The chi-squared test is a simple way to provide estimates of quantities of interest and related confidence intervals [5]. It is a measure of associations between variables (such as the fields of the tables in a relational database) where the variables are nominal and related to each other [6]. The chi-squared test is popular in the medical domain because of its simplicity. It has been used in pharmacology to classify text according to subtopics [7]. The resulting chi-squared value is a measure of the differences between a set of observed and expected frequencies within a population, and is given by the formula [5]:

$$\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$$

where r is the number of unique terms in a particular field of the patient records such as diagnosis or hearing aid type, corresponding to rows in Table 30.1, and c is the number of categories in the data (such as age or gender), corresponding to columns in Table 30.1. Table 30.1 is produced for two diagnoses occurring in the hearing diagnosis field. For example, if 535 of the hearing diagnosis fields of the records of patients 'Age ≤ 54' years contained the diagnosis 'tinnitus', we would record a value of 535 for that term being associated with that category. These values were the 'observed' values, denoted O_ij in the formula above. The corresponding 'expected' values E_ij were found by the formula:

Row total × Column total / Grand total

The row total for the 'tinnitus' diagnosis is the total number of times the 'tinnitus' diagnosis was assigned to patients in both age categories = 535 + 592 = 1127. The column total for 'Age ≤ 54' is the total number of patients in that age group over both diagnoses = 702. The grand total is the total number of patient records in the study = 1364. Thus the 'expected' number of patients diagnosed with 'tinnitus' in the 'Age ≤ 54' group was 1127 × 702/1364 = 580.02. The significance of this is that the expected value is greater than the observed value, suggesting that there is a negative degree of association between the 'tinnitus'

Table 30.1 Observed and expected frequencies for diagnosis

Diagnosis       Age ≤ 54                   Age > 54                   Row total
Not-tinnitus    167 (121.98) [2027.24]     70 (115.02) [2027.24]      237
Tinnitus        535 (580.02) [2027.24]     592 (546.98) [2027.24]     1127
Column total    702                        662                        1364

Expected frequencies are in ( ); (observed frequency − expected frequency)² values are in [ ]


diagnosis and the category 'Age ≤ 54'. The remainder of the test is then performed to discover whether this association is statistically significant. Since we were in effect performing many individual statistical tests, it was necessary to use the Bonferroni correction [5] to control the rate of Type I errors, where a pair of variables spuriously appears to be associated. For example, for us to be 99.9% confident that a particular keyword was typical of a particular category, the corresponding significance level of 0.001 had to be divided by the number of simultaneous tests, i.e. the number of unique words times the number of categories. In the case of words in the text fields, this gave a corrected significance level of 0.001/(2 × 2) = 0.00025. Using West's chi-squared calculator [8], for significance at this corrected level with one degree of freedom, we obtained a chi-squared threshold of 13.41. Thus each word associated with a category with a chi-squared value of more than 13.41 was taken to be significantly associated with that category at the 0.001 level. The overall chi-squared values for the relationship between the test variables age and gender and hearing aid type (behind the ear, BTE, or in the ear, ITE) are shown in Table 30.2. The overall chi-squared value for the relationship between the words in the comments text and hearing aid type was calculated by summing the chi-squared values of all possible pairs of a text word with a BTE/ITE right aid, and is also shown in Table 30.2. These data show, with 99.9% confidence, that the text words were not randomly distributed, but that some text words are probably associated with hearing aid type. Similarly, the associations of each of the variables (age, comments text, gender and hearing aid type) with tinnitus diagnosis are shown in Table 30.3. Here we see that there are significant associations between age, comments text, and BTE/ITE right aid and a diagnosis of tinnitus, but there is no significant association between gender and tinnitus diagnosis.
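The calculation described above can be reproduced from the counts in Table 30.1. The following is a minimal sketch using only the Python standard library; the function names are illustrative, and the threshold is obtained from the normal distribution (valid for one degree of freedom only) rather than from the online calculator used by the authors.

```python
from statistics import NormalDist

# Observed counts from Table 30.1: rows are diagnoses, columns are age groups.
observed = [
    [167, 70],   # not-tinnitus: Age <= 54, Age > 54
    [535, 592],  # tinnitus:     Age <= 54, Age > 54
]


def expected_counts(table):
    """Expected count for each cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def chi_squared(table):
    """Sum of (O - E)^2 / E over all cells of the table."""
    expected = expected_counts(table)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )


def critical_value_df1(alpha):
    """Chi-squared critical value for one degree of freedom.

    With df = 1 the statistic is the square of a standard normal variable,
    so the critical value is the squared two-sided normal quantile.
    """
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return z * z


statistic = chi_squared(observed)
alpha_corrected = 0.001 / (2 * 2)  # Bonferroni correction used in the text
threshold = critical_value_df1(alpha_corrected)

print(f"chi-squared = {statistic:.2f}")
print(f"threshold   = {threshold:.2f}")
print(f"significant = {statistic > threshold}")
```

The statistic obtained this way agrees with the value of 41.45 reported for age in Table 30.3, and the threshold agrees with the 13.41 quoted in the text.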

Table 30.2 Overall χ² with BTE/ITE right aid

Fields           Overall χ²    Degrees of freedom (df)    P
Age              10.53         1                          <0.001
Comments text    5421.84       663                        <0.001
Gender           33.68         1                          <0.001

Table 30.3 Overall χ² with tinnitus diagnosis

Fields               Overall χ²    Degrees of freedom (df)    P
Age                  41.45         1                          <0.001
Comments text        492.26        60                         <0.001
Gender               0.18          1                          =0.6714
BTE/ITE right aid    31.75         1                          <0.001


To use the chi-squared test, the expected frequency values must all be at least 1, and most should exceed 5 [9]. To be on the safe side, we insisted that for each word, all the expected values should be at least 5, so all words failing this test were grouped into a single class called 'OTHERS'. Keywords associated with categories with 95% confidence were deemed typical of those categories if O > E; otherwise they were deemed atypical. The keywords most typical and atypical of the four categories (hearing aid type, age, tinnitus and gender) are shown in Tables 30.4 and 30.5. A 'keyword' could either be a category type (where * denotes a diagnosis category, and *** denotes a hearing aid category), or a word from the free-text comments field (denoted **). The discovered associations seem intuitively reasonable. For example, it appears that patients with 'Age ≤ 54' tend not to have tinnitus, and patients not having tinnitus had a problem with wax and were using BTE hearing aids. The words tinnitus (ringing in the ears) and masker (a machine for producing white noise to drown out tinnitus) were atypical of this category. It was found that males tended more to use ITE hearing aids and females tended more to use BTE hearing aids. The hearing aid types associated with BTE were those with high gain and had changes made to the ear mould. Similarly, ITE hearing aid types used lacquer and vents, required reshelling of ear impressions, had changes made to the hearing aid, were reviewed, and the wearers were making progress. For these experiments, we used all the records available in the database for each field under study, keeping the criterion that none of the field values should be empty. In Table 30.4, 70 was calculated as the median age of the BTE/ITE right aid group, and in Table 30.5, 54 was the median age of the records with a not-tinnitus or tinnitus diagnosis. In Tables 30.4 and 30.5 some keywords in the comments text were abbreviations, such as 'reshel' for 'reshell' and 'fta' for 'failed to attend appointment'. 'Tinnitus' appears as 'tinnitu' in the tables, since all the text was passed through Porter's stemmer [10] for the removal of grammatical endings.
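The grouping rule and the typical/atypical labelling described above can be sketched as follows. The word counts are hypothetical and are not taken from the study, the function names are illustrative, and the significance test that the authors apply before labelling a keyword is deliberately left out to keep the sketch short.

```python
# Hypothetical word counts per hearing-aid category (NOT data from the study).
counts = {
    "mould":    {"BTE": 120, "ITE": 30},
    "lacquer":  {"BTE": 5,   "ITE": 90},
    "rareword": {"BTE": 3,   "ITE": 1},
    "oddword":  {"BTE": 1,   "ITE": 2},
}
MIN_EXPECTED = 5.0  # every expected value must reach this, as in the text


def expected_table(table):
    """Expected count per cell: row total * column total / grand total."""
    categories = sorted({c for row in table.values() for c in row})
    col_totals = {c: sum(row[c] for row in table.values()) for c in categories}
    grand_total = sum(col_totals.values())
    return {
        word: {c: sum(row.values()) * col_totals[c] / grand_total for c in categories}
        for word, row in table.items()
    }


def group_rare_words(table, min_expected=MIN_EXPECTED):
    """Merge every word with an expected count below min_expected into 'OTHERS'."""
    expected = expected_table(table)
    grouped, others = {}, {}
    for word, row in table.items():
        if min(expected[word].values()) >= min_expected:
            grouped[word] = dict(row)
        else:
            for category, n in row.items():
                others[category] = others.get(category, 0) + n
    if others:
        grouped["OTHERS"] = others
    return grouped


def typicality(table):
    """Label each (word, category) cell 'typical' if O > E, otherwise 'atypical'."""
    expected = expected_table(table)
    return {
        (word, category): "typical" if n > expected[word][category] else "atypical"
        for word, row in table.items()
        for category, n in row.items()
    }


grouped = group_rare_words(counts)
labels = typicality(grouped)
print(grouped)
print(labels[("mould", "BTE")], labels[("lacquer", "BTE")])
```

In this toy table the two rare words fall below the expected-value limit and are pooled, while 'mould' comes out as typical of BTE and 'lacquer' as atypical of BTE.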

Table 30.4 Categories with positive and negative keywords in records with BTE/ITE right aid

Age ≤ 70
  Positive keywords: *Not found
  Negative keywords: *Not found

Age > 70
  Positive keywords: *Not found
  Negative keywords: *Not found

BTE
  Positive keywords: **mould, be34, map, gp, 92, audio, inf, be52, ref, staff, reqd, be36, contact
  Negative keywords: **fta, reshel, appt, it, nn, nfa, 2001, rev, lacquer, hn, km, imp, review, 2000

ITE
  Positive keywords: **fta, reshel, appt, it, nn, nfa, 2001, rev, lacquer, hn, km, imp, review, 2000, nh, vent, progress, aid, dt, taken
  Negative keywords: **mould, be34, map, gp, 92, audio, inf, be52, ref, staff, reqd, be36, contact, tri, n, order

Male
  Positive keywords: ***ITE
  Negative keywords: ***BTE

Female
  Positive keywords: ***BTE
  Negative keywords: ***ITE


Table 30.5 Categories with positive and negative keywords in records with a tinnitus/not-tinnitus diagnosis

Age ≤ 54
  Positive keywords: *Not-tinnitus
  Negative keywords: *Not found

Age > 54
  Positive keywords: *Not found
  Negative keywords: *Not-tinnitus

Not-tinnitus
  Positive keywords: **OTHERS, lost, ear, wax, L, aid ***BTE
  Negative keywords: **masker, tinnitu, rev, help, appt, 2001, 2000, counsel, ok, further, progress, fta ***ITE

Tinnitus
  Positive keywords: **masker, tinnitu ***Not found
  Negative keywords: **OTHERS ***Not found

Male
  Positive keywords: ***Not found
  Negative keywords: ***Not found

Female
  Positive keywords: ***Not found
  Negative keywords: ***Not found

30.4 Measures of Association in Categorical Data

Yule's Q is a measure of the strength of association between categorical variables. Unlike the chi-squared test, which tells us how certain we can be that a relationship between two variables exists, Yule's Q gives both the strength and direction of that relationship [6]. In the following 2 × 2 table,

           Present    Absent
Present    A          B
Absent     C          D

Yule's Q is given by

$$Q = \frac{AD - BC}{AD + BC} \qquad (2)$$

where A, B, C and D are the observed quantities in each cell. Yule's Q is in the range -1 to +1, where the sign indicates the direction of the relationship and the absolute value indicates the strength of the relationship. Yule's Q does not distinguish complete associations (where one of the cell values = 0) from absolute relationships (where two diagonally opposite cell values are both zero), and is only suitable for 2 × 2 tables. In Tables 30.6, 30.7, 30.8, and 30.9, Yule's Q values for age with comment text, diagnosis, hearing aid type, and mould are given. Similarly, in Tables 30.10, 30.11, and 30.12, Yule's Q values for gender with comment text, hearing aid type and mould are given. '(P)' and '(A)' stand for present and absent.
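Equation (2) is straightforward to evaluate. The sketch below applies it to three rows of the tables that follow, assuming that A and B are the counts of records in which the keyword is present for the first and second category, and that C and D are the corresponding counts in which it is absent. The function name is illustrative only.

```python
def yules_q(a, b, c, d):
    """Yule's Q for the 2 x 2 table [[a, b], [c, d]]: (ad - bc) / (ad + bc)."""
    numerator = a * d - b * c
    denominator = a * d + b * c
    if denominator == 0:
        raise ValueError("Yule's Q is undefined when ad + bc = 0")
    return numerator / denominator


# Cell counts (A, B, C, D) read from Tables 30.6, 30.7 and 30.11.
examples = {
    "progress, age (Table 30.6)": (93, 13, 46833, 45555),
    "familial, age (Table 30.7)": (18, 0, 684, 662),
    "ITEHN, gender (Table 30.11)": (1280, 1732, 10465, 10936),
}

for label, cells in examples.items():
    print(f"{label}: Q = {yules_q(*cells):.2f}")
```

A single zero cell, as in the 'familial' row, is enough to drive Q to exactly +1 or -1, which is why the measure cannot separate complete from absolute associations.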


In Table 30.6, a Yule's Q value of 0.75 shows that there is a positive association between the keyword 'progress' and the category 'Age ≤ 70', which can be restated as a negative association between the keyword 'progress' and the category 'Age > 70'. In Table 30.7, for 'diagnosis' there is a complete association between 'familial' and 'Age ≤ 54', resulting in a Yule's Q value of 1. This should be viewed in comparison to the chi-squared value for the same association, 17.20 (P < 0.001), showing both that the association is very strong and that we can be highly confident that it exists. The presence of this association shows that a higher proportion of younger people than older people report to the hearing aid clinic with familial (inherited) deafness. Familial deafness is relatively rare but can affect any age group, while 'OTHERS' would include 'old-age deafness' (presbycusis), which is relatively common but obviously restricted to older patients. However, in Table 30.9, Yule's Q for 'V2' is 0.18, which shows only a weak association between this mould and 'Age ≤ 70', while the chi-squared value for the same association, 30.25 (P < 0.001), showed that it is highly likely that the association exists. In Table 30.11, Yule's Q for 'ITEHN' (a type of hearing aid worn inside the ear) is -0.13, which shows a weak negative association between 'ITEHN' and 'male', or in other words, a weak positive association between 'ITEHN' and 'female'. In comparison, the chi-squared value for the same association, 43.36 (P < 0.001), showed that we can be highly confident that the relationship exists. These results show the complementary nature of the chi-squared and Yule's Q results: in all three cases the chi-squared value was highly significant, suggesting that the relationship was highly likely to exist, while Yule's Q showed that the strength (strong in the first case, weak in the others) and the direction (positive in the first two cases, negative in the third) of the relationship differed among the three cases.

Table 30.6 Yule's Q for comment text and age

Comment text    Age ≤ 70 (P)    Age > 70 (P)    Age ≤ 70 (A)    Age > 70 (A)    Yule's Q
Progress        93              13              46833           45555           0.75
Dna             105             20              46821           45548           0.67
Masker          565             126             46361           45442           0.63
Tinnitus        385             123             46541           45445           0.51
Help            222             84              46704           45484           0.44
Counsel         191             80              46735           45488           0.40
2000            288             125             46638           45443           0.38
Fta             542             332             46384           45236           0.23
Gp              370             615             46162           55060           -0.16
Wax             341             601             46191           55074           -0.19
Ref             248             487             46284           55188           -0.24
Contact         37              129             46495           55546           -0.49
Insert          23              102             46509           55573           -0.58
Reqd            15              111             46517           55564           -0.72
Cic             10              76              46522           55599           -0.73
Staff           17              132             46515           55543           -0.73
Map             15              125             46517           55550           -0.75
Dv              29              245             46503           55430           -0.75
Reinstruct      8               68              46524           55607           -0.75

Table 30.7 Yule's Q for diagnosis and age

Diagnosis    Age ≤ 54 (P)    Age > 54 (P)    Age ≤ 54 (A)    Age > 54 (A)    Yule's Q
Familial     18              0               684             662             1.00
OTHERS       113             44              589             618             0.46

Table 30.8 Yule's Q for hearing aid type and age

Hearing aid type    Age ≤ 70 (P)    Age > 70 (P)    Age ≤ 70 (A)    Age > 70 (A)    Yule's Q
PFPPCL              42              1               11105           10899           0.95
PPCL                78              5               11069           10895           0.88
BE101               44              4               11103           10896           0.83
PPC2                53              6               11094           10894           0.79
ITENL               123             35              11024           10865           0.55
OTHERS              103             37              11044           10863           0.46
ITEHH               536             317             10611           10583           0.26
–                   4668            3947            6479            6953            0.12
BE34                640             882             10507           10018           -0.18
ITENH               403             592             10744           10308           -0.21
ITENN               683             1063            10464           9837            -0.25
BE36                97              203             11050           10697           -0.37

Table 30.9 Yule's Q for mould and age

Mould     Age ≤ 70 (P)    Age > 70 (P)    Age ≤ 70 (A)    Age > 70 (A)    Yule's Q
N8        261             94              10873           10805           0.47
SIL       255             101             10879           10798           0.43
V2        575             397             10559           10502           0.18
2107V1    601             913             10533           9986            -0.23

Table 30.10 Yule's Q for comment text and gender

Comment text    M (P)    F (P)    M (A)    F (A)    Yule's Q
He              67       2        46465    55673    0.95
Wife            44       2        46488    55673    0.93
Dv              80       254      46452    55421    -0.45

Table 30.11 Yule's Q for hearing aid type and gender

Hearing aid type    M (P)    F (P)    M (A)    F (A)    Yule's Q
ITEHH               665      201      11080    12467    0.58
ITENH               725      295      11020    12373    0.47
ITEHN               1280     1732     10465    10936    -0.13
ITENN               734      1038     11011    11630    -0.14

Table 30.12 Yule's Q for mould and gender

Mould    M (P)    F (P)    M (A)    F (A)    Yule's Q
IROS     80       24       11671    12644    0.57
V2       640      342      11111    12326    0.35
N8       253      141      11498    12527    0.32

30.5 Support and Confidence for Associations

We examined two measures of association commonly used in market basket analysis, support and confidence [4], for all relations between age and diagnosis, and between gender and diagnosis. We were unable to find many rules with high support and confidence because of the very high proportion of one type of diagnosis ('tinnitus') in the records. However, we feel that, given an audiology database where a diagnosis was routinely recorded for every patient, more rules of the form A ⇒ B (A implies B) would be found. Our results are given in [11].
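For readers unfamiliar with the two measures, the sketch below computes them for one candidate rule, 'Age > 54 ⇒ tinnitus', using the counts in Table 30.1. It assumes the usual definitions from the data mining literature: support is the fraction of all records that satisfy both A and B, and confidence is the fraction of records satisfying A that also satisfy B. The rule and the function names are chosen for illustration and are not results reported by the authors.

```python
# Counts from Table 30.1 (diagnosis by age group).
N_TOTAL = 1364        # all records in the table
N_AGE_OVER_54 = 662   # records with Age > 54 (antecedent A)
N_BOTH = 592          # records with Age > 54 and a tinnitus diagnosis (A and B)


def support(n_both, n_total):
    """Fraction of all records in which both A and B hold."""
    return n_both / n_total


def confidence(n_both, n_antecedent):
    """Fraction of the records satisfying A in which B also holds."""
    return n_both / n_antecedent


rule_support = support(N_BOTH, N_TOTAL)
rule_confidence = confidence(N_BOTH, N_AGE_OVER_54)

print(f"support    = {rule_support:.3f}")
print(f"confidence = {rule_confidence:.3f}")
```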

30.6 Likelihood Ratios for Associated Keywords

In Bayesian evidence evaluation [6], the value of a piece of evidence may be expressed as a likelihood ratio (LR), as follows:

$$\mathrm{LR} = \frac{\Pr(E \mid H)}{\Pr(E \mid \bar{H})}$$

For example, our hypothesis (H) might be that a patient should be fitted with a BTE hearing aid as opposed to an ITE hearing aid. E is a piece of evidence, such as the word 'tube' appearing in the patient's comments field of the database. Pr(E | H) is then the probability of seeing this evidence given that the hypothesis is true. Of all the 34394 records where a patient was given a BTE aid, 29 contained the word 'tube', so in this case Pr(E | H) = 29/34394 = 0.000843. Pr(E | H̄) is the probability of seeing the word 'tube' when the hypothesis is not true. Of all the 29445 records where a patient was given an ITE aid, only 2 contained the word 'tube', so here Pr(E | H̄) was 2/29445 = 0.0000679. This gives an LR of 0.000843/0.0000679 = 12.41. Using Evett et al.'s [12] scale of verbal equivalences of the LR, an LR in the range 10–100 indicates moderate support for the hypothesis. LRs in the range 0.1–10 indicate only limited support either way, while an LR in the range 0.01 to 0.1 would indicate moderate support for the complementary hypothesis. The words giving the highest and lowest LR values

Table 30.13 Likelihood ratios for comments text and BTE/ITE right aids

Word          BTE      ITE      LR
Adequ         14       0        NA
Audiometer    10       0        NA
Be10          18       0        NA
Be201         18       0        NA
Be301         13       0        NA
Be37          12       0        NA
Be51          13       0        NA
Hac           11       0        NA
Temporari     11       0        NA
Therapy       13       0        NA
Be52          68       2        29.11
Be53          26       1        22.26
Be36          57       3        16.27
Be54          35       2        14.98
Retub         34       2        14.55
Seri          16       1        13.70
Cwc           15       1        12.84
Tube          29       2        12.41
Couldn't      14       1        11.99
Orig          14       1        11.99
"map          13       1        11.13
Map           116      9        11.03
E             12       1        10.27
Hn            8        77       0.09
Progress      4        39       0.09
Readi         1        10       0.09
Concertina    1        11       0.08
Unless        1        11       0.08
Coat          1        13       0.07
Cap           1        15       0.06
Vc            1        15       0.06
Hnv1          1        17       0.05
Hh            1        20       0.04
Reshel        6        136      0.04
Lacquer       2        65       0.03
Facepl        0        15       0
Window        0        16       0
Total         34394    29445

with respect to a BTE fitting as opposed to an ITE fitting are shown in Table 30.13, where NA indicates division by zero as the word never appeared in records for patients fitted with an ITE hearing aid. All words which were used in the chi-squared analysis (since their expected values were all 5 or more) were also considered for this analysis. LR values are useful for the combination of evidence. Using the evidence that the text comments field contains ‘lacquer’, ‘reshell’ and ‘progress’, we can


estimate the likelihood of the patient requiring a BTE hearing aid by iteratively using the relationship 'posterior odds = LR × prior odds'. Initially we obtain the prior odds (Pr(BTE)/Pr(ITE)) from a large sample or manufacturer's data. Using the column totals in Table 30.13, the prior odds in favour of a BTE aid before any other evidence has been taken into account would be 34394/29445 = 1.168 to 1. Taking the first piece of evidence (the presence of the word 'lacquer') into account, the posterior odds are 0.03 × 1.168 = 0.035. This posterior odds value now becomes the prior odds for the second iteration. The LR for 'reshell' is 0.04, so the posterior odds become 0.04 × 0.035 = 0.0014. This posterior odds value now becomes the prior odds for the third iteration. The LR for 'progress' is 0.09, so the final posterior odds become 0.09 × 0.0014 = 0.000126. Since these posterior odds are much less than 1, it is much more likely that the patient should be fitted with an ITE hearing aid. This simple example shows the basis on which a Bayesian decision support system that returns the more suitable type of hearing aid could be constructed.
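The worked example above can be expressed as a short sketch. The counts and LR values are taken from Table 30.13; the function names are illustrative. Multiplying the LRs one after another implicitly treats the words as independent pieces of evidence given the type of aid, which is a simplifying assumption of this sketch.

```python
TOTAL_BTE = 34394  # records with a BTE aid (Table 30.13)
TOTAL_ITE = 29445  # records with an ITE aid (Table 30.13)


def likelihood_ratio(count_h, total_h, count_not_h, total_not_h):
    """LR = Pr(E | H) / Pr(E | not H); None when the denominator is zero."""
    if count_not_h == 0:
        return None  # reported as NA in Table 30.13
    return (count_h / total_h) / (count_not_h / total_not_h)


def update_odds(prior_odds, likelihood_ratios):
    """Apply 'posterior odds = LR x prior odds' once per piece of evidence."""
    odds = prior_odds
    for lr in likelihood_ratios:
        odds *= lr
    return odds


# LR for the word 'tube': 29 BTE records and 2 ITE records contain it.
lr_tube = likelihood_ratio(29, TOTAL_BTE, 2, TOTAL_ITE)

# Prior odds in favour of BTE, then the evidence 'lacquer', 'reshell', 'progress'.
prior_odds = TOTAL_BTE / TOTAL_ITE
posterior_odds = update_odds(prior_odds, [0.03, 0.04, 0.09])

print(f"LR('tube')     = {lr_tube:.2f}")
print(f"prior odds     = {prior_odds:.3f}")
print(f"posterior odds = {posterior_odds:.6f}")
```

Because the update is a running product, the order in which the words are taken into account does not change the final odds.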

30.7 Conclusion

In this work we have discovered typical and atypical words related to different fields of audiology data, by first using the chi-squared measure to show which relations most probably exist, then using Yule's Q measure of association to find the strength and direction of those relations. The likelihood ratio, also based on the contingency table, provides a means whereby all the words in the comments field can be taken into account in a Bayesian decision support system for audiologists. We are currently working on the development of a logistic regression model, where the overall value log(Pr(BTE)/Pr(ITE)) will be a linear combination of the presence or absence of each of the discovered associated variables described in this chapter. Analogous reasoning will be used for models to predict whether or not a patient should be given a tinnitus masker, and whether or not he or she would benefit from a hearing aid fitting. Rules found by data mining should not only be accurate and comprehensible, but also 'surprising'. McGarry presents a taxonomy of 'interestingness' measures whereby the value of discovered rules may be evaluated [13]. In this chapter, we have looked at objective interestingness criteria, such as the statistical significance of the discovered rules, but we have not yet considered subjective criteria such as unexpectedness and novelty. These require comparing machine-derived rules with the prior expectations of domain experts. A very important subjective criterion is 'actionability', which includes such considerations as impact: will the discovered rules lead to any changes in current audiological practice?

Acknowledgments We wish to thank Maurice Hawthorne, Graham Clarke and Martin Sandford at the Ear, Nose and Throat Clinic at James Cook University Hospital in Middlesbrough, England, UK, for making the large set of audiology records available to us.


References

1. Pendharkar PC, Rodger JA, Yaverbaum GJ, Herman N, Benner M (1999) Association, statistical, mathematical and neural approaches for mining breast cancer patterns. Expert Syst Appl, Elsevier Science Ltd 17:223–232
2. Bramer M (2007) Principles of data mining. Springer, London, pp 187–218
3. Ordonez C, Ezquerra N, Santana CA (2006) Constraining and summarizing association rules in medical data. In: Cercone N et al (eds) Knowledge and information systems. Springer, New York, pp 259–283
4. Han J, Kamber M (2006) Data mining concepts and techniques, 2nd edn. Morgan Kaufmann Publishers, San Diego, pp 227–272
5. Altman DG (1991) Practical statistics for medical research. Chapman & Hall, London, pp 241–248, 211, 271
6. Lucy D (2005) Introduction to statistics for forensic scientists. Wiley, Chichester, pp 45–52, 112–114, 133–136
7. Oakes M, Gaizauskas R, Fowkes H et al (2001) Comparison between a method based on the chi-square test and a support vector machine for document classification. In: Proceedings of ACM SIGIR, New Orleans, pp 440–441
8. Chi-square calculator (2010). http://www.stat.tamu.edu/~west/applets/chisqdemo.html
9. Agresti A (2002) Categorical data analysis, 2nd edn. Wiley series in probability and statistics. Wiley, New York, p 80
10. Porter MF (1980) An algorithm for suffix stripping. Program 14(3):130–137
11. Anwar MN, Oakes MP, McGarry K (2010) Chi-squared and associations in tabular audiology data. In: Lecture notes in engineering and computer science: proceedings of the world congress on engineering 2010, WCE 2010, London, UK, vol 1, pp 346–351
12. Evett IW, Jackson G, Lambert JA, McCrossan S (2000) The impact of the principles of evidence interpretation and the structure and content of statements. Sci Justice 40:233–239
13. McGarry K (2005) A survey of interestingness measures for knowledge discovery. Knowl Eng Rev J 20(1):39–61

Chapter 31

Optimising Order Splitting and Execution with Fuzzy Logic Momentum Analysis

Abdalla Kablan and Wing Lon Ng

Abstract This study proposes a new framework for high frequency trading using a fuzzy logic based momentum analysis system. An order placement strategy will be developed and optimised with adaptive neuro fuzzy inference in order to analyse the current "momentum" in the time series and to identify the current market condition, which will then be used to decide the dynamic participation rate given the current traded volume. The system was applied to trading of financial stocks, and tested against the standard volume based trading system. The results show how the proposed Fuzzy Logic Momentum Analysis System outperforms the standard volume based systems that are widely used in the financial industry.

A. Kablan (✉) · W. L. Ng
Centre for Computational Finance and Economic Agents (CCFEA), University of Essex, Wivenhoe Park, Colchester, CO4 3SQ, UK
e-mail: [email protected]
W. L. Ng e-mail: [email protected]

S. I. Ao and L. Gelman (eds.), Electrical Engineering and Applied Computing, Lecture Notes in Electrical Engineering, 90, DOI: 10.1007/978-94-007-1192-1_31, © Springer Science+Business Media B.V. 2011

31.1 Introduction

The modelling of financial systems continues to hold great interest not only for researchers but also for investors and policymakers. Many of the characteristics of these systems, however, cannot be adequately captured by traditional financial modelling approaches. Financial systems are complex, nonlinear, dynamically changing systems in which it is often difficult to identify interdependent variables and their values. In particular, the problem of optimal order execution has been a main concern for financial trading and brokerage firms for decades [1]. The idea of executing a client’s order to buy or sell a pre-specified number of shares at a price better than all other competitors seems intriguing. However, this involves the implementation of a system that considers the whole price formation process from a different point of view. Financial brokers profit from executing clients’ orders to buy and sell certain amounts of shares at the best possible price. Many mathematical and algorithmic systems have been developed for this task [2], yet they do not seem able to outperform a standard volume based system.

Most systems use well-documented technical indicators from financial theory for their observations. For example, Dourra and Siy [3] used three technical indicators in their stock trading system: the rate of change, the stochastic momentum indicator and a support-resistance indicator that is based on the 30-day price average. A convergence module then maps these indices as well as the closing price to a set of inputs for the fuzzy system, thus providing a total of seven inputs. In some cases, such as the rate of change, one indicator maps to a single input. However, it is also possible to map one indicator to multiple inputs. Four levels of quantification for each input value are used: small, medium, big and large. In this case, Mamdani’s form of fuzzy rules [4] can be used to combine these inputs and produce a single output variable with a value between 0 and 100. Low values indicate a strong sell, high values a strong buy. The system is evaluated using 3 years of historical stock price data from four companies with variable performance during one period and employing two different strategies (risk-based and performance-based). In each strategy, the system begins with an initial investment of $10,000 and assumes a constant transaction cost of $10. Tax implications are not taken into consideration. The resulting system output is shown to compare favourably with stock price movement, outperforming the S&P 500 in the same period.
The application presented in this study differs from the above, as it introduces a fuzzy logic-based system for momentum analysis [5]. The system uses fuzzy reasoning to analyse the current market condition under which a given equity’s price is moving. This analysis is then used in a trading application. The membership functions were first chosen by the expert-based method and later optimised using ANFIS [6], further improving the trading strategy and order execution results.

31.2 Fuzzy Logic Momentum Analysis System (FULMAS)

31.2.1 Fuzzy Inference

A fuzzy inference system is a rule-based fuzzy system that can be seen as an associative memory and comprises five components:

• A rule base, which consists of the fuzzy if–then rules.
• A database, which defines the membership functions of the fuzzy sets used in the fuzzy rules.
• A decision-making unit, which is the core unit and is also known as the inference engine.
• A fuzzification interface, which transforms crisp inputs into degrees of matching linguistic values.
• A defuzzification interface, which transforms fuzzy results into crisp output.

Many types of fuzzy inference systems have been proposed in the literature [7]. However, in the implementation of an inference system, the most common is the Sugeno model, which makes use of if–then rules to produce an output for each rule. Rule outputs consist of a linear combination of the input variables plus a constant term; the final output is the weighted average of each rule’s output. The rule base in the Sugeno model has rules of the form:

$$\text{If } X \text{ is } A_1 \text{ and } Y \text{ is } B_1, \text{ then } f_1 = p_1 X + q_1 Y + r_1. \tag{31.1}$$

$$\text{If } X \text{ is } A_2 \text{ and } Y \text{ is } B_2, \text{ then } f_2 = p_2 X + q_2 Y + r_2. \tag{31.2}$$

Here $X$ and $Y$ are the input variables, $A_i$ and $B_i$ are fuzzy sets characterised by their membership functions, and $p_i$, $q_i$ and $r_i$ are the consequent parameters. In evaluating the first-order Sugeno model [8], the degree of membership of $X$ in $A_i$ is multiplied by the degree of membership of $Y$ in $B_i$, and the product is the rule weight $W_i$. Finally, the weighted average of $f_1$ and $f_2$ is taken as the final output $Z$, which is calculated as

$$Z = \frac{W_1 f_1 + W_2 f_2}{W_1 + W_2}. \tag{31.3}$$
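As a purely illustrative check of Eqs. 31.1 to 31.3, the following worked example uses assumed numbers that are not taken from the chapter: inputs $X = 2$, $Y = 3$, consequent parameters $(p_1, q_1, r_1) = (1, 1, 0)$ and $(p_2, q_2, r_2) = (2, 1, 1)$, and membership degrees $\mu_{A_1}(X) = 0.8$, $\mu_{B_1}(Y) = 0.5$, $\mu_{A_2}(X) = 0.2$, $\mu_{B_2}(Y) = 0.5$.

```latex
\begin{align*}
f_1 &= 1 \cdot 2 + 1 \cdot 3 + 0 = 5, & W_1 &= 0.8 \times 0.5 = 0.4,\\
f_2 &= 2 \cdot 2 + 1 \cdot 3 + 1 = 8, & W_2 &= 0.2 \times 0.5 = 0.1,\\
Z &= \frac{0.4 \cdot 5 + 0.1 \cdot 8}{0.4 + 0.1} = \frac{2.8}{0.5} = 5.6.
\end{align*}
```

The output lies closer to $f_1$ than to $f_2$ because the first rule fires four times as strongly as the second.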

In the case of designing a fuzzy system for financial modelling, one should opt to use a model similar to Mamdani and Assilian [4], which is based on linguistic variables and linguistic output. Basically, fuzzy logic provides a reasoning-like mechanism that can be used for decision making. Combined with a neural network architecture, the resulting system is called a neuro-fuzzy system. Such systems are used for optimisation since they combine the reasoning mechanism that fuzzy logic offers together with the pattern recognition capabilities of neural networks, which will be discussed in the following.

31.2.2 Adaptive Neuro Fuzzy Inference System (ANFIS)

The ANFIS is an adaptive network of nodes and directional links with associated learning rules [6]. The approach learns the rules and membership functions from the data [8]. It is called adaptive because some or all of the nodes have parameters that affect the output of the node. These networks identify and learn relationships between inputs and outputs, and have high learning capability and membership function definition properties. Although adaptive networks cover a number of different approaches, for our purposes we focus on the method proposed by Jang et al. [9], with the architecture shown in Fig. 31.1. The circular nodes have a fixed input–output relation, whereas the square nodes have parameters to be learnt.

Fig. 31.1 ANFIS architecture for a two rule Sugeno system (inputs X and Y; Layer 1 nodes A1, A2, B1 and B2; Layers 2 to 5 leading to the output F)

Typical fuzzy rules are defined as a conditional statement in the form:

$$\text{If } X \text{ is } A_1, \text{ then } Y \text{ is } B_1 \tag{31.4}$$

$$\text{If } X \text{ is } A_2, \text{ then } Y \text{ is } B_2 \tag{31.5}$$

However, in ANFIS we use the first-order Takagi–Sugeno system [8] shown in Eqs. 31.1 and 31.2. ANFIS can also be used to design forecasting systems [10]. We briefly discuss the five layers in the following:

1. The output of each node in Layer 1 is:

$$O_{1,i} = \mu_{A_i}(x) \quad \text{for } i = 1, 2$$

$$O_{1,i} = \mu_{B_{i-2}}(y) \quad \text{for } i = 3, 4$$

Hence, $O_{1,i}$ is essentially the membership grade for $x$ and $y$. Although the membership functions could be very flexible, experimental results lead to the conclusion that for the task of financial data training, the bell-shaped membership function is most appropriate (see, e.g., Abonyi et al. [11]). We calculate

$$\mu_{A_i}(x) = \frac{1}{1 + \left| \dfrac{x - c_i}{a_i} \right|^{2 b_i}}, \tag{31.6}$$

where $a_i$, $b_i$, $c_i$ are parameters to be learnt. These are the premise parameters.

2. In Layer 2, every node is fixed. This is where the t-norm is used to “AND” the membership grades, for example, the product:

$$O_{2,i} = W_i = \mu_{A_i}(x)\, \mu_{B_i}(y), \quad i = 1, 2. \tag{31.7}$$

3. Layer 3 contains fixed nodes that calculate the ratio of the firing strengths of the rules:

$$O_{3,i} = \bar{W}_i = \frac{W_i}{W_1 + W_2}. \tag{31.8}$$

4. The nodes in Layer 4 are adaptive and perform the consequent of the rules:

$$O_{4,i} = \bar{W}_i f_i = \bar{W}_i (p_i x + q_i y + r_i). \tag{31.9}$$

The parameters $(p_i, q_i, r_i)$ in this layer are to be determined and are referred to as the consequent parameters.

5. In Layer 5, a single node computes the overall output:

$$O_{5,1} = \sum_i \bar{W}_i f_i = \frac{\sum_i W_i f_i}{\sum_i W_i} \tag{31.10}$$

This is how the input vector is typically fed through the network layer by layer. We then consider how the ANFIS learns the premise and consequent parameters for the membership functions and the rules in order to optimise these in the Fuzzy Logic Momentum Analysis System to produce a further improved system with a higher performance.
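The five layers described above can be sketched as a short forward pass. This is a minimal illustration for a two-rule system, not the authors' implementation; the function names and all parameter values are assumptions chosen only to make the example runnable, and no learning step is included.

```python
def bell(x, a, b, c):
    """Generalised bell membership function (Eq. 31.6)."""
    return 1.0 / (1.0 + abs((x - c) / a) ** (2 * b))


def anfis_forward(x, y, premise_a, premise_b, consequents):
    """Forward pass of a two-rule, first-order Sugeno ANFIS.

    premise_a, premise_b: lists of (a, b, c) tuples for A_i and B_i.
    consequents: list of (p, q, r) tuples, one per rule.
    """
    # Layer 1: membership grades of the crisp inputs
    mu_a = [bell(x, *params) for params in premise_a]
    mu_b = [bell(y, *params) for params in premise_b]
    # Layer 2: firing strengths via the product t-norm (Eq. 31.7)
    w = [ma * mb for ma, mb in zip(mu_a, mu_b)]
    # Layer 3: normalised firing strengths (Eq. 31.8)
    total = sum(w)
    w_bar = [wi / total for wi in w]
    # Layer 4: weighted rule consequents (Eq. 31.9)
    rule_out = [wb * (p * x + q * y + r)
                for wb, (p, q, r) in zip(w_bar, consequents)]
    # Layer 5: overall output (Eq. 31.10)
    return sum(rule_out)


# Hypothetical parameter values, for illustration only
premise_a = [(2.0, 2.0, 0.0), (2.0, 2.0, 4.0)]
premise_b = [(2.0, 2.0, 0.0), (2.0, 2.0, 4.0)]
consequents = [(1.0, 1.0, 0.0), (2.0, 1.0, 1.0)]
output = anfis_forward(2.0, 2.0, premise_a, premise_b, consequents)
print(output)
```

With these assumed parameters both rules fire equally at the point (2, 2), so the output is the plain average of the two rule consequents. In a trained ANFIS the premise and consequent tuples would be the quantities adjusted during learning.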

31.2.3 FULMAS for Trading

Creating a fuzzy inference system to detect momentum is a complex task. The identification of various market conditions has been a topic subject to various theories and suggestions [12]. In the following, the proposed fuzzy inference system categorises the market conditions into seven categories based on price movement, using the current volume to determine the participation rates (PR) of the trading system each time. The participation rate is the amount of volume that will be traded at each instance. The first step in designing the Fuzzy Logic Momentum Analysis System involves defining the “market conditions” that the fuzzy system has to identify. The following seven market conditions are used to cover all possible movements of the price series:

• Rallying
• Strong up
• Slightly up
• Average
• Slightly down
• Strong down
• Crashing


These conditions are considered as linguistic values for the fuzzy logic system, and they will be used to determine the current state of the price formation and its momentum. As momentum builds, the system considers the previous $x$ ticks and performs an inference procedure by accumulating the movements of each price relative to the previous price, in order to determine whether the general trend has been up or down after $x$ points. Let $P_i$ denote the current price and $P_{i-1}$ the previous price; $k_i$ is a counter that records the direction of each price movement. Whenever the price goes up ($P_i > P_{i-1}$), $k_i = +1$; whenever the price goes down ($P_i < P_{i-1}$), $k_i = -1$; and when the price is unchanged, $k_i = 0$. Hence, this can be used to identify market conditions from price movements: if the market is moving strongly upwards, this will be detected by having more $+1$ than $-1$ or $0$. This can be modelled as

$$\text{Momentum}(x) = \sum_{i=1}^{x} k_i \tag{31.11}$$

where x is the number of ticks over which we want to detect the momentum. For example, if we want to detect the momentum of the last 100 ticks, we count all up and down movements and then feed the resulting number to the fuzzy system, whose output would lie somewhere in the membership functions. The choice of triangular membership functions was made after using the expert based method, where it was suggested that triangular membership functions should be used due to their mathematical simplicity. Triangular shapes require three parameters and are made of a closed interval and a kernel comprised of a singleton. This simplifies the choice of placing the membership functions. The expert merely has to choose the central value and the curve slope on either side. The same procedure is applied for calculating the linguistic variable “volatility”, where the linguistic values are:

• Very high
• High
• Medium
• Low
• Very low
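The momentum counter of Eq. 31.11 and a triangular membership function can be sketched as follows. This is a minimal illustration under stated assumptions: the rescaling of the momentum to a 0 to 100 score and the breakpoints of the "Slightly up" set are hypothetical and are not taken from the chapter, and the triangle is assumed to satisfy left < centre < right.

```python
def momentum(prices, x):
    """Sum of tick directions k_i over the last x ticks (Eq. 31.11)."""
    recent = prices[-(x + 1):]  # x movements need x + 1 prices
    total = 0
    for prev, curr in zip(recent, recent[1:]):
        if curr > prev:
            total += 1   # price went up
        elif curr < prev:
            total -= 1   # price went down
        # unchanged prices contribute 0
    return total


def triangular(value, left, centre, right):
    """Triangular membership: 0 outside (left, right), 1 at centre."""
    if value <= left or value >= right:
        return 0.0
    if value <= centre:
        return (value - left) / (centre - left)
    return (right - value) / (right - centre)


prices = [10.0, 10.1, 10.1, 10.2, 10.15, 10.3]
x = 5
m = momentum(prices, x)            # movements: +1, 0, +1, -1, +1
# Hypothetical rescaling of [-x, x] onto a 0..100 input axis
score = 50.0 * (1.0 + m / x)
# Hypothetical "Slightly up" fuzzy set on that axis
slightly_up = triangular(score, 50.0, 65.0, 80.0)
print(m, score, slightly_up)
```

An expert would only need to supply the three breakpoints per linguistic value, which is the simplicity argument made in the text for triangular shapes.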

The fuzzy logic system considers both market momentum and volatility. It generates the rules and then takes a decision on the amount of market participation. This is illustrated in Fig. 31.2.

31.3 Empirical Analysis

Experiments in this study have been carried out on high-frequency tick data obtained from ICAP plc for both Vodafone Group plc (VOD) and Nokia Corporation (NOK). A very important characteristic of this type of data is that it is irregularly spaced in time, which means that the price observations (ticks) are taken in real-time (as they arrive). The application is designed for an interdealer broker (an interdealer broker is a member of a major stock exchange who is permitted to deal with market makers, rather than the public, and can sometimes act as a market maker), which means that they have the ability to create orders with any amount of volume. For both stocks, 2 months of high-frequency tick data between 2 January 2009 and 27 February 2009 have been obtained; simulations are terminated whenever 1 million shares have been bought or sold. The fuzzy logic system receives the first batch of data and performs all of the buy or sell actions on it. The same procedure is repeated using the standard volume-based system. Finally, the performance of both systems is compared. It must be mentioned that 2 months of high-frequency tick data is a significantly large amount of data, considering that at every iteration the system analyses the momentum of the past 100 ticks (Fig. 31.3).

Fig. 31.2 Extracting fuzzy rules from both volatility and momentum

Fig. 31.3 Time series data of NOK and VOD prices (two panels, VOD and NOK; price against date, 2 January 2009 to 2 March 2009)

31.3.1 Standard Volume System (SVS)

A standard brokerage and trading mechanism for executing large orders is a simple volume-based system that tracks the volume being traded; whenever a certain number of shares (a threshold) has been traded, the system will buy or sell (depending on the order) a certain percentage. If there is an order to trade one million shares of a certain stock, the threshold could be, for example, 10,000 shares. Whenever 10,000 shares have been traded and if the participation rate PR is set to 25%, the system will buy or sell 25% of the average volume. If the accumulated sum of the volume exceeds the predefined threshold, then the amount of shares traded is equal to the PR multiplied by the current volume:

$$\text{Total SVS Cost} = \sum_{i=1}^{n} \text{price}_i \times (\text{amount of shares})_i$$

where n is the number of operations required to reach the target order (for example, 1 million shares). The above system has proven to be efficient and is being adopted by many financial brokerage and execution institutions [13].
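The volume-threshold mechanism and the total cost sum can be sketched as below. This is a simplified illustration, not the system used in the study: it assumes that "current volume" means the market volume accumulated since the last trade, that a trade fires once that volume reaches the threshold, and that each trade fills at the tick price. The tick data are invented.

```python
def svs_schedule(ticks, target_shares, threshold, participation_rate):
    """Simulate a standard volume system over (price, volume) ticks.

    Returns the list of (price, shares) trades and the total cost.
    """
    trades = []
    accumulated = 0            # market volume seen since the last trade
    remaining = target_shares  # shares still to be bought or sold
    for price, volume in ticks:
        accumulated += volume
        if accumulated >= threshold and remaining > 0:
            # Trade PR times the accumulated volume, capped at the remainder
            shares = min(int(participation_rate * accumulated), remaining)
            trades.append((price, shares))
            remaining -= shares
            accumulated = 0
        if remaining == 0:
            break
    # Total SVS cost: sum of price_i times shares_i over all trades
    total_cost = sum(price * shares for price, shares in trades)
    return trades, total_cost


# Invented ticks: (price, traded volume)
ticks = [(10.0, 6000), (10.2, 5000), (10.1, 12000), (9.9, 3000)]
trades, total_cost = svs_schedule(ticks, target_shares=5000,
                                  threshold=10000, participation_rate=0.25)
print(trades, total_cost)
```

Here the order completes in two operations, so n = 2 in the cost formula. A FULMAS-style variant would differ only in replacing the constant `participation_rate` with a value chosen per trade from the detected market condition.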

31.3.2 Benchmark Performance Measures

Although many systems have used different approaches, such as quantum modelling, to determine the various participation rates (PR), they usually fail to outperform the standard volume system in the long term. The aim of this study is to show that FULMAS outperforms this type of system in the long run; this is assessed using order execution costs for buy and sell orders. In particular, FULMAS will be applied to determine the PR in the market according to the current momentum. For example, for a buy order, it is preferable to increase the PR (number of shares bought at that time) when the price is low and to decrease the participation when the price is high. The idea here is to use the momentum analysis system to identify the market condition in which we currently reside.

Table 31.1 Participation rates for buy side and the sell side of FULMAS

Market condition    Buying participation rates (%)    Selling participation rates (%)
Rallying            10                                40
Strong up           15                                35
Slightly up         20                                30
Average             25                                25
Slightly down       30                                20
Strong down         35                                15
Crashing            40                                10

This will enable us to vary the PR, providing a trading advantage, since the system can trade aggressively when the condition is at one extreme and minimise its trading when the condition is at the other extreme. In other words, if we are selling 1 million shares, the system will make a trade whenever the threshold of volume has been exceeded. However, if the current market condition indicates that the price is very high or rallying, then we know that this is a suitable time to sell a lot of shares, for example, 40% of the current volume. The same concept applies when the momentum indicates that the price is strong down, which means that the system should sell a lower volume at this low price, for example, 15%. The reverse mechanism applies for buying shares. When the market is crashing, this is a good indicator that we should buy a large volume (40%), and when the price is at an average point, it would behave like the SVS system, i.e., buying 25% of the volume. This is shown in Table 31.1. The same procedure is applied to volatility and then combined with volume to produce the fuzzy rules.

When implementing SVS and FULMAS, the benchmark against which both systems will be compared is the outperformance of FULMAS over the SVS, expressed in basis points (one hundredth of 1%). To calculate the improvement (imp) for the buy and sell sides, the following formulas are used:

$$\text{imp}_{\text{Buy}} = \left(1 - \frac{\text{FULMAS price}}{\text{SVS price}}\right) \times 10^4 \ \text{bps}$$

$$\text{imp}_{\text{Sell}} = \left(\frac{\text{FULMAS price}}{\text{SVS price}} - 1\right) \times 10^4 \ \text{bps}$$

where FULMAS price is the total cost of buying x amount of shares using FULMAS (or, on the sell side, the total proceeds of selling them), and SVS price is the corresponding total for the same number of shares using the traditional SVS.
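The participation rates of Table 31.1 and the basis-point benchmark can be sketched as a lookup plus a small helper. The sign convention is an assumption made for this illustration: an improvement is positive when FULMAS pays less than SVS on the buy side or receives more than SVS on the sell side. The totals in the usage lines are invented.

```python
# Participation rates from Table 31.1, as fractions of current volume
BUY_PR = {"Rallying": 0.10, "Strong up": 0.15, "Slightly up": 0.20,
          "Average": 0.25, "Slightly down": 0.30, "Strong down": 0.35,
          "Crashing": 0.40}
SELL_PR = {"Rallying": 0.40, "Strong up": 0.35, "Slightly up": 0.30,
           "Average": 0.25, "Slightly down": 0.20, "Strong down": 0.15,
           "Crashing": 0.10}


def participation_rate(condition, side):
    """Look up the PR for a market condition on the 'buy' or 'sell' side."""
    table = BUY_PR if side == "buy" else SELL_PR
    return table[condition]


def improvement_bps(fulmas_total, svs_total, side):
    """Outperformance of FULMAS over SVS in basis points (assumed signs)."""
    ratio = fulmas_total / svs_total
    if side == "buy":
        return (1.0 - ratio) * 1e4   # paying less is an improvement
    if side == "sell":
        return (ratio - 1.0) * 1e4   # receiving more is an improvement
    raise ValueError("side must be 'buy' or 'sell'")


# Invented totals for one million shares
buy_imp = improvement_bps(999_500.0, 1_000_000.0, "buy")
sell_imp = improvement_bps(1_000_500.0, 1_000_000.0, "sell")
print(participation_rate("Crashing", "buy"), buy_imp, sell_imp)
```

In both invented cases the improvement is about 5 bps, which corresponds to 0.05% of the order value.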

31.3.3 Results

The complementary characteristics of neural networks and fuzzy inference systems have been recognised and the methodologies have been combined to create neuro-fuzzy techniques. Indeed, earlier work by Wong and Wang [14] described an artificial neural network with processing elements that could handle fuzzy logic and probabilistic information, although the preliminary results were less than satisfactory. In this study, ANFIS is used to optimise the membership functions in FULMAS. This is performed by feeding the ANFIS system both the training data and the desired output, and tuning the ANFIS in order to reach the target result by modifying the membership functions (see Figs. 31.4 and 31.5). In other words, at each instance, ANFIS is fed the results currently obtained from the fuzzy system together with a set of target prices or data. This target price will be an optimal price that is far better than the current one (a cheaper price in buy mode or a higher price in sell mode). The system runs and modifies the membership functions in each epoch in order to get as close to the optimal price as possible.

Fig. 31.4 Triangular membership functions optimised using ANFIS (initial and final MFs for Crashing, StrongDown, SlightlyDown, Average, SlightlyUp, StrongUp and Rallying; degree of membership against input, 0 to 100)

Fig. 31.5 Bell-shaped membership functions optimised using ANFIS (initial and final MFs for Crashing, StrongDown, SlightlyDown, Average, SlightlyUp, StrongUp and Rallying; degree of membership against input, 0 to 100)

Comparing the results for both sets of optimised membership functions showed an improvement over the original system. The optimised triangular membership functions have also outperformed the optimised bell-shaped membership functions; this supports the experts’ opinion mentioned above concerning the choice of the triangular membership functions. Table 31.2 displays the improvement of FULMAS against SVS, showing the descriptive statistics of the improvement rate of buying or selling one million shares. This improvement rate can be either positive, when FULMAS has outperformed SVS, or negative, when FULMAS was outperformed by SVS. In particular, we see a much higher outperformance than in the previous system, which indicates that the use of ANFIS to optimise the membership functions has increased the performance of the system on both the buy and sell sides. For example, Table 31.2 shows that on the buying side, the system, on average, outperforms the standard system by more than six basis points. On an industrial scale, this means a large amount of savings for financial institutions that employ such systems to vary the participation rates. Other descriptive statistics such as the standard deviation, skewness and kurtosis are also included. These imply that the outperformance of FULMAS over SVS is actually considerable given the higher values of the median. Also, the skewness is closer to zero, and the kurtosis has decreased in most cases, both implying a higher accuracy of the improved system.

Table 31.2 Analysis of results of buying and selling 1 million shares of NOK and VOD with the descriptive statistics of the improvement indicators (in bps per trade)

                     Mean     Median   Std dev   Skewness   Kurtosis
Initial results
  Buying NOK         2.98     4.63     12.39     -0.05      2.56
  Buying VOD         12.48    1.58     36.25     1.74       4.86
  Selling NOK        1.68     2.92     8.79      -1.43      6.25
  Selling VOD        2.73     2.46     27.71     0.70       8.84
Optimised results
  Buying NOK         6.94     6.57     12.99     0.15       2.53
  Buying VOD         14.48    4.33     2.95      -0.74      3.28
  Selling NOK        9.36     5.79     9.18      -0.52      2.61
  Selling VOD        7.71     6.91     28.23     0.86       9.38
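The descriptive statistics reported in Table 31.2 can be computed from a series of per-trade improvements along the following lines. This is a sketch under assumptions: the chapter does not state which estimators were used, so the population standard deviation and moment-based skewness and (non-excess) kurtosis are assumed here, and the sample values are invented.

```python
import statistics


def describe(improvements):
    """Mean, median, std dev, skewness and kurtosis of improvements in bps."""
    n = len(improvements)
    mean = statistics.fmean(improvements)
    median = statistics.median(improvements)
    std = statistics.pstdev(improvements)  # population standard deviation
    # Third and fourth central moments
    m3 = sum((v - mean) ** 3 for v in improvements) / n
    m4 = sum((v - mean) ** 4 for v in improvements) / n
    return {
        "mean": mean,
        "median": median,
        "std": std,
        "skewness": m3 / std ** 3,   # 0 for a symmetric sample
        "kurtosis": m4 / std ** 4,   # 3 for a normal distribution
    }


# Invented per-trade improvements in bps (negative: SVS did better)
sample = [-2.0, 0.0, 2.0, 4.0, 6.0]
stats = describe(sample)
print(stats)
```

A positive mean with a median of similar size, as in this invented sample, would indicate that the average outperformance is not driven by a few extreme trades.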

31.4 Summary and Discussion

It is well known that a main inadequacy of economic theory is that it postulates exact functional relationships between variables. In empirical financial analysis, data points rarely lie exactly on straight lines or smooth functions. Ormerod [15] suggests that attempting to accommodate these nonlinear phenomena will introduce an unacceptable level of instability in models. As a result of this intractability, researchers and investors are turning to artificial intelligence techniques to better inform their models, creating decision support systems that can help a human user better understand complex financial systems such as stock markets. Artificial intelligence systems in portfolio selection have been shown to have a performance edge over the human portfolio manager, and recent research suggests that approaches incorporating artificial intelligence techniques are also likely to outperform classical financial models [16].

This study has introduced a system that utilises fuzzy logic in order to identify the current market condition that is produced by the accumulation of momentum. FULMAS is a fuzzy logic momentum analysis system that outperforms the traditional systems used in industry, which are often based on executing orders dependent on the weighted average of the current volume. Results of the implemented system have been displayed and compared against the traditional system. The results indicate that, on average, the system increases profitability on orders on both the buy and sell sides. FULMAS has been improved further by using ANFIS as an optimisation tool, and the new results have shown a significant improvement over both the original FULMAS system and the SVS system.

Acknowledgments The authors would like to thank Mr. Phil Hodey, the head of portfolio management and electronic trading at ICAP plc, for providing the tick data used in the simulations of the system and for his invaluable support and guidance.

References

1. Ellul A, Holden CW, Jain P, Jennings RH (2007) Order dynamics: recent evidence from the NYSE. J Empirical Finance 14(5):636–661
2. Chu HH, Chen TL, Cheng CH, Huang CC (2009) Fuzzy dual-factor time-series for stock index forecasting. Expert Syst Appl 36(1):165–171
3. Dourra H, Siy P (2002) Investment using technical analysis and fuzzy logic. Fuzzy Sets Syst 127(2):221–240
4. Mamdani E, Assilian S (1975) An experiment in linguistic synthesis with a fuzzy logic controller. Int J Man Mach Stud 7(1):1–13
5. Kablan A, Ng WL (2010) High frequency trading using fuzzy momentum analysis. In: Lecture notes in engineering and computer science: proceedings of the world congress on engineering 2010, WCE 2010, vol I, 30 June–2 July, London, UK, pp 352–357
6. Jang JR (1993) ANFIS: adaptive network-based fuzzy inference system. IEEE Trans Syst Man Cybern 23(3):665–685
7. Dimitrov V, Korotkich V (2002) Fuzzy logic: a framework for the new millennium, studies in fuzziness and soft computing, vol 81. Springer, New York
8. Takagi T, Sugeno M (1985) Fuzzy identification of systems and its application to modeling and control. IEEE Trans Syst Man Cybern 15(1):116–132
9. Jang JR, Sun CT, Mizutani E (1997) Neuro-fuzzy and soft computing. Prentice Hall, Upper Saddle River
10. Atsalakis GS, Valavanis KP (2009) Forecasting stock market short-term trends using a neurofuzzy based methodology. Expert Syst Appl 36(7):10696–10707
11. Abonyi J, Babuska R, Szeifert F (2001) Fuzzy modeling with multivariate membership functions: gray box identification and control design. IEEE Trans Syst Man Cybern B 31(5):755–767
12. Griffin J (2007) Do investors trade more when stocks have performed well? Evidence from 46 countries. Rev Financ Stud 20(3):905–951
13. Goldstein MA, Irvine P, Kandel E, Wiener Z (2009) Brokerage commissions and institutional trading patterns. Rev Financ Stud 22(12):5175–5212
14. Wong FS, Wang PZ (1990) A stock selection strategy using fuzzy neural networks. Neurocomputing 2(5):233–242
15. Ormerod P (2000) Butterfly economics: a new general theory of social and economic behaviour. Pantheon, New York
16. Brabazon A, O’Neill M, Maringer D (2010) Natural computing in computational finance, vol 3. Springer, Berlin

Chapter 32

The Determination of a Dynamic Cut-Off Grade for the Mining Industry

P. V. Johnson, G. W. Evatt, P. W. Duck and S. D. Howell

Abstract Prior to extraction from a mine, a pit is usually divided up into 3-D ‘blocks’ which contain varying levels of estimated ore-grades. From these, the order (or ‘pathway’) of extraction is decided, and this order of extraction can remain unchanged for several years. However, because commodity prices are uncertain, once each block is extracted from the mine, the company must decide in real-time whether the ore grade is high enough to warrant processing the block further in readiness for sale, or simply to waste the block. This paper first shows how the optimal cut-off ore grade—the level below which a block should be wasted—is not simply a function of the current commodity price and the ore grade, but also a function of the ore-grades of subsequent blocks, the costs of processing, and the bounds on the rates of processing and extraction. Secondly, the paper applies a stochastic price uncertainty, and shows how to derive an efficient mathematical algorithm to calculate and operate a dynamic optimal cut-off grade criterion throughout the extraction process, allowing the mine operator to respond to future market movements. The model is applied to a real mine composed of some 60,000 blocks, and shows that an extra 10% of value can be created by implementing such an optimal regime.

P. V. Johnson (&) G. W. Evatt P. W. Duck School of Mathematics, University of Manchester, Manchester, UK e-mail: [email protected] G. W. Evatt e-mail: [email protected] P. W. Duck e-mail: [email protected] S. D. Howell Manchester Business School, University of Manchester, Manchester, UK e-mail: [email protected]

S. I. Ao and L. Gelman (eds.), Electrical Engineering and Applied Computing, Lecture Notes in Electrical Engineering, 90, DOI: 10.1007/978-94-007-1192-1_32, © Springer Science+Business Media B.V. 2011


32.1 Introduction

Mineral mining is a complex engineering operation, which can last for several decades. As such, significant consideration must be given to the planning and design of the operation, so that numerous engineering constraints can be met, whilst making sure the operation is economically viable. To compound the difficulty of the task, the planning and scheduling of extraction from a mine is made in the presence of uncertainties, such as the future commodity price and estimated ore-grade. These uncertainties can fluctuate on a daily basis, highlighting the different timescales upon which the mining company must base decisions: the shorter time-scales governed by commodity price and realised ore-grade, and the longer time-scales governed by (amongst other things) extraction rates and processing capacities. The focus of this paper is upon one of these short time-scale decisions: whether to process the extracted material, or to waste it. The level of ore-grade which separates this decision is known as the ‘cut-off grade’ [7].

Prior to extraction, the planning of the extraction schedule begins with deciding an appropriate pathway (or order) through the mine. Whilst it is possible to alter the order of extraction at various points during extraction, it is generally not a particularly flexible decision, as changing the order can require moving extraction machinery and processing units, cancelling contracts and incurring large overhead costs. As such, it is reasonable to assume that the pathway through the mine is fixed, but it is how one progresses, and operates, along that pathway that is variable. At this planning stage, the mine is graphically divided up into 3-D blocks, each containing its own estimated quantity of ore. The estimated ore-grade carries with it an associated uncertainty, which can have an effect upon the valuation of a mining operation [6].
However, it is the expected (estimated) ore-grade level which dominates the planning of the actual pathway through the mine, as this is the best guess in deciding the order in which the resource should be extracted. The extraction pathway is most commonly decided using software such as the Gemcom-Whittle package [15], which allows companies to construct feasible pit shapes that satisfy slope constraints on the angle of the pit, transportation needs and work-force limitations. As previously mentioned, this algorithm may be used several times throughout a mine’s life, so as to ensure the mine plan is consistent with market conditions; however, on a day-to-day basis the mine must take more detailed scheduling decisions in real-time.

The key real-time decision is whether or not to process the latest extracted block (e.g. by milling or electrolysis) in readiness for sale, where the block’s intrinsic value varies with its ore grade and with the underlying commodity price. We define a ‘cost-effective’ block as one whose ore grade is high enough to pay the cash costs of processing, at the current price. However, the cut-off ore grade (above which a block should be processed) need not be set as low as the grade above which the block will be cost-effective to process. Disparity between the rate of extraction and the maximum processing capacity means that there can be an opportunity cost to processing all cost-effective material, since the small short-term gain of processing a low grade block could be surpassed by bringing forward the processing of more valuable blocks instead. The optimal wasting of potentially cost-effective material is the focus of this paper.

To highlight the above point, let us consider a trivial case where the mine has a stock of 3 blocks awaiting processing, extracted in order, A, B and C, whose current market values after processing costs are V_A = $1, V_B = $50 and V_C = $1,000. Whilst, classically, analysis has often been indifferent to the order of processing, with enough discounting applied one can see that by an optimal cut-off criterion, it would be best to simply waste A and get on with processing B and C. This is because the value gained in processing A is less than the time value of money lost in waiting to process B and C at a later date. This lack of consideration of the discount rate has been highlighted before as a drawback in current mine planning [14] but, as yet, little progress has been made with it.

Another consequence of an optimal cut-off grade decision is having to increase the rate of extraction of poor quality ores to keep the processing plant loaded. This is because a processing unit will typically operate at a fixed capacity, and closing (or restarting) it is a costly and undesirable operation. As such, a maximum (and minimum) possible extraction rate must be known. This clearly illustrates the link between extraction rate and the optimal cut-off grade. With this maximum possible extraction rate, one knows precisely which blocks can possibly be extracted within each period in time, and thus the decision as to which block to process next can be decided.

There have been several other approaches to mine valuation and the corresponding extraction regime. Typically these have relied upon simulation methods to capture the uncertainty of price and ore-grade [8, 9, 12].
These types of method can be extremely time consuming, with computing times of several hours [3], and can often lead to sub-optimal and incomplete results. Using these simulation techniques, optimal cut-off grades were investigated by Menabde et al. [10], although little insight into the core dynamics, performance or robustness was obtained. A similar approach is the use of genetic algorithms (a general technique commonly used by computer scientists), which are capable of calculating mine schedules whilst adhering to specified constraints upon their design [11]. Whilst the work of Myburgh and Deb [11] was suitable for calculating feasible paths, the criteria by which this particular study operated were, again, not given, and the computing time was also of the order of hours. To make a step-change away from these methods, partial differential equations (PDEs) can be implemented to capture the full mine optimisation process, which builds on work by Brennan and Schwartz [2] and Chen and Forsyth [4]. The inclusion of stochastic ore-grade uncertainty via PDEs has also been tested by Evatt et al. [6], which enabled mine valuations to be produced in under 10 s and showed that the effect on mine value of stochastic ore-grade variation is much less than the effect of stochastic price. Whilst the mathematics and numerics of this PDE approach are relatively complex at the outset, once solved, they produce highly accurate results in short times, complete with model input sensitivities. This paper extends the use of PDEs, adding a model for tactical processing


decisions under foreseeable variations in ore grade and unforeseeable fluctuations in price. This shows that when processing capacity is constrained, the ability to maximise the value of processing by varying the cut-off ore grade can add significantly to mine value when optimally applied. By solving rapidly under a range of processing constraints, the scale of the processing plant can itself be optimised. In Sect. 32.2 we demonstrate the underlying concepts determining the optimal cut-off decision rule, and in Sect. 32.3 we apply a price uncertainty to the model and use a contingent claims approach to derive the governing equation. We then apply the model to a mine composed of some 60,000 blocks in Sect. 32.4, to show how much extra value the running of an optimal cut-off grade regime can add to a valuation. We draw together our concluding remarks in Sect. 32.5.

32.2 Cut-Off Grade Optimisation

The selection of the cut-off grade criteria reduces to whether a cost-effective block should be processed or not. This is because there is the possibility that a more valuable block could be brought forward in time to be processed, which otherwise would lose more time-value of money than the value gained from processing the first block. To highlight this point let us consider the order of extraction of blocks from a mine, which we (hypothetically) place in a chronologically ordered row. As we operate the processing unit of the mine, we must pass along this row and decide which blocks to process and which blocks to waste. In reality, although we know the (estimated) ore-grades of the blocks in advance, until we know for certain the market price at the time of processing we cannot know what cashflow each block will generate. Yet even if we assume a constant price, we can show that dynamic cut-off grade decision making is still required and optimal. Consider a highly simplified mine, as shown in Fig. 32.1, which is composed of just two blocks, Block1 and Block2, with ore grades \(G_1\) and \(G_2\), respectively. We allow the mine to have the capacity within the rate of extraction to immediately process either the first block, Block1, or its successor, Block2. As such, the comparison is between the value of processing both blocks in order, given by \(V_{12}\), or the value of only processing Block2, \(V_2\). With a constant price, \(S\), we can write down the net present value of these two (already extracted) blocks, where we shall process both,

\[ V_{12} = (S G_1 - P) + (S G_2 - P)\, e^{-r\,\delta t}. \tag{32.1} \]

Here \(\delta t\) is the time it takes to process each block, \(P\) is the cost of processing each block and the discount rate is \(r\). This value must be compared to the decision to waste the first block and process only the second block, which would have value

\[ V_2 = (S G_2 - P). \tag{32.2} \]

Fig. 32.1 Two examples of how price may affect the order in which blocks are processed so as to maximise a mine's NPV. Example A is made with a low commodity price, S = $1,000 per kg, and Example B is made with a high commodity price, S = $10,000 per kg. Block1 (10 kg) is extracted before Block2 (1000 kg).

Potential block values            Block1     Block2        NPV
Example A (S = $1,000 per kg)
  process both blocks             $9,900     $989,950      $999,850
  waste Block1                    Waste      $999,900      $999,900
Example B (S = $10,000 per kg)
  process both blocks             $99,990    $9,900,040    $10,000,030
  waste Block1                    Waste      $9,999,900    $9,999,900

This comparison between \(V_{12}\) and \(V_2\) is one the algorithm must continually make. To demonstrate how the selection depends upon the underlying price, Fig. 32.1 shows the choices available for two different commodity prices, one high (S = $10,000 per kg) and one low (S = $1,000 per kg). These are made with prescribed parameter values

\[ r = 10\%, \qquad P = \$100\ \text{block}^{-1}, \qquad \delta t = 0.1\ \text{year}. \tag{32.3} \]

As can be seen, in the low-price case, Example A, it is best to process only the second block. However, in the high commodity price case, namely Example B, it is best to process both blocks. This simple example demonstrates (albeit with rather exaggerated parameter values) how the selection needs to be actively taken, and how different values of the underlying price, and discount rate, will affect the optimal cut-off decision. Another consequence of this optimal decision taking is that the mine will be exhausted earlier than might have been previously thought, since we wasted the first block and only processed the second, hence a mine owner could agree a shorter lease on this particular mine.
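The two-block comparison can be checked numerically. The following is a minimal sketch, assuming the parameter values of (32.3) and the block sizes of Fig. 32.1; the helper names `npv_in_order` and `compare_two_blocks` are hypothetical and chosen only for illustration.

```python
import math


def npv_in_order(block_values, r, dt):
    """NPV of processing blocks one after another, each taking dt years."""
    return sum(v * math.exp(-r * k * dt) for k, v in enumerate(block_values))


def compare_two_blocks(S, G1, G2, P, r, dt):
    """Return (V12, V2): process both blocks in order, or waste Block1."""
    v1 = S * G1 - P                       # undiscounted value of Block1
    v2 = S * G2 - P                       # undiscounted value of Block2
    V12 = npv_in_order([v1, v2], r, dt)   # Eq. (32.1)
    V2 = npv_in_order([v2], r, dt)        # Eq. (32.2)
    return V12, V2


# Parameter values from (32.3); block sizes from Fig. 32.1 (kg of commodity)
r, P, dt = 0.10, 100.0, 0.1
G1, G2 = 10.0, 1000.0

for label, S in (("A", 1_000.0), ("B", 10_000.0)):
    V12, V2 = compare_two_blocks(S, G1, G2, P, r, dt)
    decision = "process both" if V12 > V2 else "waste Block1"
    print(f"Example {label}: V12 = {V12:,.0f}, V2 = {V2:,.0f} -> {decision}")
```

The same helper covers the three-block illustration from the introduction: under the (assumed) discounting of (32.3), `npv_in_order([50, 1000], r, dt)` can be compared with `npv_in_order([1, 50, 1000], r, dt)` to see whether wasting block A pays.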

32.3 Model Construction

To create the framework for determining an optimal dynamic cut-off grade, we can make use of two distinct methods for arriving at the core equation describing the valuation, \(V\). The first method follows a contingent claims approach, in which the uncertainty arising from the underlying price is removed by hedging away the risk via short-selling suitable quantities of the underlying resource. The second method follows the Feynman–Kac probabilistic method, as described in relation to the mining industry by Evatt et al. [5], which is the chosen method for deriving a valuation when hedging is not undertaken. This second method is also permissible when hedging does take place, but a slight adjustment to the price process is required, as explained in that paper. Because Evatt et al. [5] already covers the derivation of the mine valuation, in the present paper we explain how the contingent claims approach can be used. We first prescribe three state-space variables: the price per unit of the underlying resource in the ore, \(S\); the remaining amount of ore within the mine, \(Q\); and time, \(t\). We next need to define the underlying price uncertainty process, which we assume to follow a geometric Brownian motion,

\[ dS = \mu S\, dt + \sigma_s S\, dX_s, \tag{32.4} \]

where \(\mu\) is the drift, \(\sigma_s\) the volatility of \(S\), and \(dX_s\) is the increment of a standard Wiener process. We use this price process without loss of generality, since other price processes (such as mean-reverting Brownian motion) can easily be implemented by the techniques described here. Using the contingent claims approach (see [16]) and the above notation, we may apply Ito’s lemma to write an incremental diffusive change in \(V\) as

\[ dV = \sigma_s S \frac{\partial V}{\partial S}\, dX_s + \frac{\partial V}{\partial Q}\, dQ + \left( \frac{\partial V}{\partial t} + \frac{1}{2}\sigma_s^2 S^2 \frac{\partial^2 V}{\partial S^2} + \mu S \frac{\partial V}{\partial S} \right) dt, \tag{32.5} \]

where we have taken terms of order \((dt)^2\) and \((dQ)^2\) to be negligible. We are able to remove the \(dQ\) term via the relationship between \(Q\) and \(t\) by specifying the rate of extraction, \(q_e\), namely,

\[ dQ = -q_e\, dt, \tag{32.6} \]

where \(q_e\) can be a function of all three variables, if required. This extraction rate is the function we wish to determine in our optimal cut-off regime, as it governs both how we progress through the mine and, as a consequence, which blocks we choose to waste. The rate of extraction will obviously have limitations on its operating capacity, \(q_e \in [0, q_{\max}]\), which itself could be a function of time. The rate of extraction is closely linked to the rate of processing, which should be kept at a fixed constant, \(q_p\). Hence \(q_{\max}\) must be big enough for the processing unit to always operate at its constant capacity, \(q_p\), i.e. there must always be enough cost-effective ore-bearing material being extracted from the mine so as to meet the processing capacity. Optimal variation in the extraction rate has already been shown to produce improved valuations [7], although this was achieved without considering processing limitations or grade variation.
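To make the price process (32.4) concrete, the following is a minimal sketch that samples one path of a geometric Brownian motion using the exact log-normal update. The function name `simulate_gbm`, the zero drift, the starting level and the monthly time step are assumptions made for illustration only; the 30% volatility matches the value used later in the example valuation.

```python
import math
import random


def simulate_gbm(s0, mu, sigma, horizon, steps, seed=0):
    """Sample one path of dS = mu*S*dt + sigma*S*dX on [0, horizon].

    Uses the exact log-normal update, so the path stays positive for any
    step size.
    """
    rng = random.Random(seed)
    dt = horizon / steps
    path = [s0]
    for _ in range(steps):
        z = rng.gauss(0.0, 1.0)  # standard normal draw for the Wiener increment
        growth = (mu - 0.5 * sigma ** 2) * dt + sigma * math.sqrt(dt) * z
        path.append(path[-1] * math.exp(growth))
    return path


# Hypothetical inputs: start at 100 (per cent of a reference price),
# zero drift, 30% volatility, 15 years of monthly steps.
path = simulate_gbm(s0=100.0, mu=0.0, sigma=0.30, horizon=15.0, steps=180)
print(f"final price after 15 years: {path[-1]:.1f}% of reference")
```

Setting `sigma=0` reduces the path to deterministic growth at rate `mu`, which is a convenient sanity check on the update.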


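With this relationship, (32.6), Eq. 32.5 can be transformed into

\[ dV = \sigma_s S \frac{\partial V}{\partial S}\, dX_s + \left( \frac{\partial V}{\partial t} - q_e \frac{\partial V}{\partial Q} + \frac{1}{2}\sigma_s^2 S^2 \frac{\partial^2 V}{\partial S^2} + \mu S \frac{\partial V}{\partial S} \right) dt. \tag{32.7} \]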

To follow the conventional approach in creating and valuing risk-free portfolios we construct a portfolio, \(\Pi\), in which we are instantaneously long in (owning) the mine and short in (owing) \(\gamma_s\) amounts of commodity contracts. This defines \(\Pi = V - \gamma_s S\), such that

\[ d\Pi = dV - \gamma_s\, dS. \tag{32.8} \]

This portfolio is designed to contain enough freedom in \(\gamma_s\) to be able to continually hedge away the uncertainty of \(dX_s\), which is the standard approach in creating risk-free portfolios [1, 13]. It also implies that within a small time increment, \(dt\), the value of \(\Pi\) will increase by the risk-free rate of interest, minus any economic value generated and paid out by the mine during the increment. This economic value is typically composed of two parts, the first, negative, being the cost to extract, \(q_e M\), and the second, positive, the cash generated by selling the resource content of the ore processed, \(q_p (SG - P)\). Here \(M\) is the cost of extraction per ore tonne, \(P\) is the processing cost per ore tonne, and \(G\) is the ore grade (weight of commodity per ore tonne). The reason why the economic functions contain the factors \(q_e\) or \(q_p\) is that we wish to maximise value by varying \(q_e\) in real time, so as to maintain \(q_p\) at its fixed bound. In turning the discrete block model into a continuous function describing the ore grade, \(G\), we have assumed that blocks are small enough that they can be approximated as infinitesimal increments of volume. As discussed in Sect. 32.2, the decision whether to process or waste the next block must be optimised. Before or after optimisation the incremental change in portfolio value may be written as

\[ d\Pi = r\Pi\, dt + \gamma_s \delta S\, dt - q_p (GS - P)\, dt + q_e M\, dt, \tag{32.9} \]

where \(\delta\) is the convenience yield on the commodity.

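By setting the appropriate value of \(\gamma_s\) to be

\[ \gamma_s = \frac{\partial V}{\partial S}, \]

and substituting Eqs. (32.4), (32.7) and (32.8) into (32.9), we may write our mine valuation equation as

\[ \frac{1}{2}\sigma_s^2 S^2 \frac{\partial^2 V}{\partial S^2} + \frac{\partial V}{\partial t} - q_e \frac{\partial V}{\partial Q} + (r - \delta) S \frac{\partial V}{\partial S} - rV + q_p (GS - P) - q_e M = 0. \tag{32.10} \]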

This is of the same form as that derived by Brennan and Schwartz [2], except that they added taxation terms, but did not model processing constraints or variations of ore grade.
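The substitution that leads to (32.10) is not written out in the text. The following is a sketch of the intermediate steps, under two stated assumptions: the remaining ore decreases as \(dQ = -q_e\,dt\), and \(\delta\) is read as a convenience yield that the short position pays at rate \(\gamma_s \delta S\). Signs were chosen so that the final line agrees with (32.10); they should be checked against the original typeset equations.

```latex
% Step 1: portfolio increment, using (32.4), (32.7) and (32.8)
\begin{align*}
d\Pi &= dV - \gamma_s\,dS \\
     &= \left(\frac{\partial V}{\partial t} - q_e\frac{\partial V}{\partial Q}
        + \tfrac12\sigma_s^2 S^2\frac{\partial^2 V}{\partial S^2}
        + \mu S\Big(\frac{\partial V}{\partial S}-\gamma_s\Big)\right)dt
        + \sigma_s S\Big(\frac{\partial V}{\partial S}-\gamma_s\Big)dX_s .
\end{align*}
% Step 2: the hedge \gamma_s = \partial V/\partial S removes the dX_s term and the drift \mu
\begin{align*}
d\Pi &= \left(\frac{\partial V}{\partial t} - q_e\frac{\partial V}{\partial Q}
        + \tfrac12\sigma_s^2 S^2\frac{\partial^2 V}{\partial S^2}\right)dt .
\end{align*}
% Step 3: equate with risk-free growth of \Pi = V - S\,\partial V/\partial S, net of cash flows
\begin{align*}
\left(\frac{\partial V}{\partial t} - q_e\frac{\partial V}{\partial Q}
  + \tfrac12\sigma_s^2 S^2\frac{\partial^2 V}{\partial S^2}\right)dt
 &= r\Big(V - S\frac{\partial V}{\partial S}\Big)dt
    + \delta S\frac{\partial V}{\partial S}\,dt
    - q_p(GS-P)\,dt + q_e M\,dt .
\end{align*}
% Step 4: collect all terms on the left-hand side and divide by dt
\begin{align*}
\tfrac12\sigma_s^2 S^2\frac{\partial^2 V}{\partial S^2}
 + \frac{\partial V}{\partial t} - q_e\frac{\partial V}{\partial Q}
 + (r-\delta)S\frac{\partial V}{\partial S} - rV + q_p(GS-P) - q_e M &= 0 .
\end{align*}
```

Note that the real-world drift \(\mu\) drops out at Step 2, which is why it does not appear in the valuation equation.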


We next need to prescribe boundary conditions for (32.10). The boundary condition that no more profit is possible occurs either when the reserve is exhausted, \(Q = 0\), or when a lease to operate the mine has reached its expiry date, \(t = T\); hence

\[ V = 0 \quad \text{on } Q = 0 \text{ and/or } t = T. \tag{32.11} \]

Since the extraction rate will have a physical upper bound, the extraction rate and cost will not vary with \(S\) when \(S\) is large. This permits a far field condition of the form

\[ \frac{\partial V}{\partial S} \to A(Q, t) \quad \text{as } S \to \infty. \tag{32.12} \]

When the underlying resource price is zero we need only solve the reduced form of Eq. 32.10 with \(S = 0\), which reduces to

\[ V = -e^{rt} \int_0^T q_e M(z)\, e^{-rz}\, dz. \tag{32.13} \]

This completes the determination of our core equation, and its boundary conditions. We can now define the optimising problem which we wish to solve: we must determine the optimal extraction rate, \(q_e^*\), at every point in the state space, which maximises the value \(V\) satisfying Eq. 32.10 with \(q_e = q_e^*\), subject to the defined boundary conditions. Problems of this type may be solved numerically using finite-difference methods, in particular the semi-Lagrangian numerical technique (see [4] for further details). All results in this paper have been thoroughly tested for numerical convergence and stability. We must now show how the optimal \(q_e^*\) and its corresponding cut-off grade are to be incorporated into the maximisation procedure.

32.4 Example Valuation

We now apply our optimal cut-off grade model to a real mine of some 60,000 blocks, whose block-by-block ore grade and sequence of extraction were supplied by Gemcom Software International. This mine has an initial capital expenditure of some $250m. We were also supplied with a fixed reference price, \(S_{\text{ref}}\), against which to compare valuations. We ourselves assumed a maximum extraction rate of five times the processing rate, which is broadly realistic, and which restricts the mine to wasting no more than 80% of any section of cost-effective ore (if one can increase the extraction rate fivefold, then it is possible to waste four blocks and process the fifth). The other parameter values we were supplied are

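\[ \begin{aligned} r &= 10\%\ \text{year}^{-1}, & \delta &= 10\%\ \text{year}^{-1}, & S_{\text{ref}} &= \$11{,}800\ \text{kg}^{-1}, & \sigma_s &= 30\%\ \text{year}^{-1/2}, \\ P &= \$4\ \text{tonne}^{-1}, & Q_{\max} &= 305{,}000{,}000\ \text{tonnes}, & M &= \$1\ \text{tonne}^{-1}, & q_p &= 20{,}000{,}000\ \text{tonnes year}^{-1}. \end{aligned} \tag{32.14} \]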
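Some simple bounds follow directly from these parameters. The sketch below is illustrative arithmetic only; it uses the ore tonnage and processing rate of (32.14) and the assumed fivefold extraction multiple, and the variable names are not from the paper.

```python
# Values quoted in (32.14) and in the text above
Q_MAX_TONNES = 305_000_000        # initial ore in the mine, tonnes
Q_P = 20_000_000                  # processing rate, tonnes per year
EXTRACTION_MULTIPLE = 5           # assumed ratio q_max / q_p

q_max = EXTRACTION_MULTIPLE * Q_P

# If every extracted tonne is processed, extraction runs at q_p.
life_without_waste = Q_MAX_TONNES / Q_P

# Extracting at q_max while processing at q_p wastes the difference.
max_waste_fraction = 1 - Q_P / q_max

# Fastest possible exhaustion, wasting as much as allowed throughout.
shortest_life = Q_MAX_TONNES / q_max

print(f"q_max              = {q_max:,} tonnes/year")
print(f"life without waste = {life_without_waste:.2f} years")
print(f"max waste fraction = {max_waste_fraction:.0%}")
print(f"shortest life      = {shortest_life:.2f} years")
```

The no-waste life of roughly 15 years and the 80% cap on wasted ore are the two figures the text relies on; the shortest life is only a theoretical lower bound.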

Whilst the ore grade is quite volatile, it was shown in Evatt et al. [6] that a suitable average of the estimated grade quality could be used without any sizeable alteration in the valuation, as one would expect, since the same volume of ore is available to be sold whether one takes average values or not. Using this average, Fig. 32.2 shows the economic worth, \(S_{\text{ref}} G - P\), throughout extraction for each part of the mine, where we have assumed the price to remain at its prescribed reference price. This highlights how the grade varies through the extraction process, and it is with reference to this grade variation that we shall compare the regions where it is optimal to speed up extraction and consequently waste certain parts of the ore body.

32.4.1 Results

For the example mine, we first calculate and compare two different valuations made with and without the optimal cut-off criterion. Figure 32.3 shows two sets of valuations: the lower pair (straight lines; one dashed, one solid) shows the valuations made assuming a constant price (\(\sigma_s = 0\%\)), and the upper pair (curved lines; one dashed, one solid) shows the effect of including both price uncertainty (\(\sigma_s = 30\%\)) and the option to abandon the mine when the valuation becomes negative, which is a standard option to include in a reserve valuation [2]. In each pair of lines the lower, dashed line shows the valuation without a cut-off regime, and the higher, solid line shows the valuation with the optimal cut-off regime. The optimal cut-off regime increases the mine valuation by up to 10%, with increasing benefit

Fig. 32.2 Given a block ordering in the mine, the average standardised grade value is the cash value of ore (against reference price) minus processing costs per tonne of ore. This data was supplied by Gemcom Software International. [Plot: average standardised grade against % of ore tonnes remaining.]

at higher prices. This may seem surprising, but although the mine is always more profitable at higher prices, the opportunity cost of not allocating the finite processing capacity to the best available block does itself grow. An obvious question which arises from this analysis is how we decide which ore grades we should waste, and what the corresponding rate of extraction is to achieve this. Given that the mine operator will know at each point in time what the current underlying price is, they can look at the corresponding slice through the 3-D surface of the optimal cut-off grade, and see for which regions in \(t\) and \(Q\) they would waste ore and increase the rate of extraction. With this we can refer back to the corresponding grade of Fig. 32.2 and easily calculate what these grades actually are. For example, by looking at the closed regions of Fig. 32.4 we can see the optimal cut-off grades for two different commodity prices, \(S = 100\%\) (top) and \(S = 200\%\) (bottom) of the reference price. The points at which it is optimal to increase the rate of extraction are given by the segments where the closed regions (bounded by the thin line) intersect with the optimal extraction trajectory (bold line). In the two examples of Fig. 32.4, both appear to correspond to a standardised cut-off grade (Fig. 32.2) of around 2 units. The optimal rate of extraction is given by the gradient of the bold line, where the trajectory is calculated by integrating (32.6) for a given extraction regime. The difference between the dashed line (trajectory for the no cut-off situation) and the thick solid line of the optimal cut-off regime therefore gives an indication of the total amount of ore wasted. Finally, Fig. 32.5 shows how the NPV depends upon the expected expiry time for extraction if one operates an optimal cut-off regime (solid line) or not (dashed line). If the mine chooses the optimal regime, the maximum NPV occurs just after 14 years, rather than at mine exhaustion at 15 years (as it is with no cut-off). This is a consequence of an optimal cut-off grade regime, in which the mine will occasionally increase its extraction rate from the (originally) planned level due to market fluctuations, thereby reaching the final pit shape in a shorter time.
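The effect of wasting ore on the time to exhaustion can be illustrated by integrating the extraction relation dQ = -q_e dt for a prescribed regime. The following is a minimal sketch, not the paper's optimal regime: the waste region (40% to 60% of ore remaining, half of it wasted) is purely hypothetical, and the function names are chosen for illustration. It assumes the plant always receives q_p, so a waste fraction w requires q_e = q_p / (1 - w).

```python
def extraction_rate(remaining_pct, q_p, waste_regions):
    """Extraction rate q_e at a given % of ore remaining.

    waste_regions is a list of (low_pct, high_pct, waste_fraction); inside a
    region the mine extracts faster so that the plant still receives q_p.
    """
    for low, high, waste in waste_regions:
        if low <= remaining_pct < high:
            return q_p / (1.0 - waste)
    return q_p


def time_to_exhaustion(q_total, q_p, waste_regions, n_slices=10_000):
    """Integrate dQ = -q_e dt from Q = q_total down to Q = 0.

    Steps in equal slices of ore; each slice dQ takes dQ / q_e years.
    """
    dq = q_total / n_slices
    elapsed = 0.0
    for i in range(n_slices):
        remaining = q_total - (i + 0.5) * dq  # midpoint of the slice
        pct = 100.0 * remaining / q_total
        elapsed += dq / extraction_rate(pct, q_p, waste_regions)
    return elapsed


Q_TOTAL = 305_000_000   # tonnes, from (32.14)
Q_P = 20_000_000        # tonnes per year, from (32.14)

baseline = time_to_exhaustion(Q_TOTAL, Q_P, [])
# Hypothetical regime: waste half of the ore between 40% and 60% remaining
# (within the 80% cap implied by q_max = 5 * q_p).
with_waste = time_to_exhaustion(Q_TOTAL, Q_P, [(40.0, 60.0, 0.5)])
print(f"no cut-off: {baseline:.2f} years, with waste region: {with_waste:.2f} years")
```

Any regime that wastes ore shortens the mine life relative to the no cut-off trajectory, which is the qualitative behaviour described above.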

Fig. 32.3 NPV of the mine against percentage of reference price for two different sets of valuations. The two lower lines (straight lines; one dashed, one solid) are for a constant price while the two upper lines (curved lines; one dashed, one solid) include price volatility and the abandonment option. NPV for the optimal cut-off regime is shown by solid lines, and no cut-off by dashed lines. [Plot: NPV [$100m] against % of base commodity price.]

Fig. 32.4 Graphs showing the optimal cut-off regions for an extraction project for two different price levels, medium (top), and high (bottom). The closed regions contained within the thin solid lines show where ore is wasted and the extraction rate is increased. The dashed line represents the one realisation of a trajectory followed with no cut-off, while the thick solid line represents the realisation of the trajectory followed with optimal cut-off. [Two panels: time remaining, T - t, against % of ore tonnes remaining.]

Fig. 32.5 The NPV of the mine against time remaining on the option on the mine given that 100% of the mine is present. The solid line is with optimal cut-off, dashed without. [Plot: NPV [$100m] against expiry date, T.]

32.5 Conclusions

This paper has shown how to solve and optimise a (relatively) short time-scale mining problem, known as a dynamic cut-off grade, which is the continuous decision of whether to process extracted ore or not. This was achieved in the presence of price uncertainty. We have described how the partial differential equation model can be derived via two distinct methods, either by a contingent claims approach, when continuous hedging is present, or by the Feynman–Kac method. Using this model, we have shown how to determine and operate an optimal dynamic cut-off grade regime. As such, we have valued the ‘option’ to process or not to process under uncertainty, allowing the mine owner to react to future market conditions. With our given example, the option adds around 10% to the expected NPV of an actual mine of 60,000 blocks. One natural extension of this work will be to allow for the cut-off grade to remain fixed for discrete periods of time, so that mine operators do not have to continually alter their rate of extraction in response to market changes.

References

1. Black F (1976) The pricing of commodity contracts. J Financial Econ 3:167–179
2. Brennan MJ, Schwartz ES (1985) Evaluating natural resource investments. J Business 58(2):135–157
3. Caccetta L, Hill SP (2003) An application of branch and cut to open pit mine scheduling. J Global Optim 27:349–365
4. Chen Z, Forsyth PA (2007) A semi-Lagrangian approach for natural gas storage valuation and optimal operation. SIAM J Sci Comput 30(1):339–368
5. Evatt GW, Johnson PV, Duck PW, Howell SD, Moriarty J (2010) The expected lifetime of an extraction project. In: Proceedings of the Royal Society A, Firstcite. doi:10.1098/rspa.2010.0247
6. Evatt GW, Johnson PV, Duck PW, Howell SD (2010) Mine valuations in the presence of a stochastic ore-grade. In: Lecture notes in engineering and computer science: proceedings of the World Congress on engineering 2010, vol III, WCE 2010, 30 June–2 July, 2010, London, UK, pp 1811–1866
7. Johnson PV, Evatt GW, Duck PW, Howell SD (2010) The derivation and impact of an optimal cut-off grade regime upon mine valuation. In: Lecture notes in engineering and computer science: proceedings of the World Congress on engineering 2010, WCE 2010, 30 June–2 July, 2010, London, UK, pp 358–364
8. Jewbali A, Dimitrakopoulos R (2009) Stochastic mine planning: example and value from integrating long- and short-term mine planning through simulated grade control. Orebody modelling and strategic mine planning, 2nd edn. The Australasian Institute of Mining and Metallurgy, Melbourne, pp 327–333
9. Martinez LA (2009) Designing, planning and evaluating a gold mine project under in-situ metal grade and metal price uncertainties. Orebody modelling and strategic mine planning, 2nd edn. The Australasian Institute of Mining and Metallurgy, Melbourne, pp 225–234
10. Menabde M, Foyland G, Stone P, Yeates GA (2004) Mining schedule optimisation for conditionally simulated orebodies. In: Proceedings of the international symposium on orebody modelling and strategic mine planning: uncertainty and risk management, pp 347–52
11. Myburgh C, Deb K (2010) Evolutionary algorithms in large-scale open pit mine scheduling. In: Proceedings of the 12th annual conference on genetic and evolutionary computation, pp 1155–1162
12. Ramazan S, Dimitrakopoulos R (2007) Stochastic optimisation of long-term production scheduling for open pit mines with a new integer programming formulation. Orebody modelling and strategic mine planning. The Australasian Institute of Mining and Metallurgy, Melbourne, pp 385–391
13. Schwartz ES (1997) The stochastic behavior of commodity prices: implications for valuation and hedging. J Finance LII(3):923–973
14. Tolwinski B, Underwood R (1996) A scheduling algorithm for open pit mines. IMA J Math Appl Bus Ind 7:247–270
15. Whittle D, Cahill J (2001) Who plans mines? In: Strategic mine planning conference, Perth, WA, pp 15–18
16. Wilmott P, Howison S, Dewynne J (1995) The mathematics of financial derivatives. Cambridge University Press, Cambridge

Chapter 33

Improved Prediction of Financial Market Cycles with Artificial Neural Network and Markov Regime Switching

David Liu and Lei Zhang

Abstract This paper analyses the forecasting of Shanghai Stock Exchange Composite Index movements for the period 1999–2009 using two competing non-linear models, a univariate Markov Regime Switching model and an Artificial Neural Network model (RBF). The experiment shows that RBF is a useful method for forecasting the regime duration of the moving trends of the Stock Composite Index. The framework employed also proves useful for forecasting Stock Composite Index turning points. The empirical results in this paper show that the ANN method is preferable to the Markov-Switching model to some extent.

D. Liu (✉), L. Zhang: Department of Mathematical Sciences, Xi’an Jiaotong Liverpool University, SIP, 215123, Suzhou, China
L. Zhang: University of Liverpool, Liverpool, UK

S. I. Ao and L. Gelman (eds.), Electrical Engineering and Applied Computing, Lecture Notes in Electrical Engineering, 90, DOI: 10.1007/978-94-007-1192-1_33, © Springer Science+Business Media B.V. 2011

33.1 Introduction

Many studies conclude that stock returns can be predicted by means of macroeconomic variables with an important business cycle component. Because a change in regime should be considered a random event that is not predictable, it is natural to analyze the Shanghai Stock Exchange Composite Index within this context. There is much empirical support for the view that macroeconomic conditions affect aggregate equity prices; accordingly, macroeconomic factors could possibly be used to predict security returns. In order to study the dynamics of the market cycles which evolved in the Shanghai Stock Exchange Market, the Composite Index is first modeled within a univariate Markov-Switching framework (MRS). One key feature of the MRS model is that it estimates the probabilities of a specific state at a given time. Past research has developed the econometric methods for estimating parameters in regime-switching models, and demonstrated how regime-switching models can characterize the time series behavior of some variables better than the existing single-regime models. The concept of Markov Switching Regimes first dates back to "Microeconomic Theory: A Mathematical Approach" [1]. Hamilton [2] applied this model to the study of the United States business cycles and regime shifts from positive to negative growth rates in real GNP. Hamilton [2] extended Markov regime-switching models to the case of autocorrelated dependent data. Hamilton and Lin also report that economic recessions are a main factor in explaining conditionally switching moments of stock market volatility [3, 4]. Similar evidence of regime switching in the volatility of stock returns has been found by Hamilton and Susmel [5], Edwards and Susmel [6], Coe [7] and [8]. Secondly, this paper deals with the application of a neural network method, a Radial Basis Function (RBF) network, to the prediction of the moving trends of the Shanghai Stock Exchange Composite Index. RBFs have been employed in time series prediction with success as they can be trained to find complex relationships in the data [9].
A large number of successful applications have shown ANN models to be a useful vehicle for forecasting financial variables and for time-series modeling and forecasting [10, 11]. In the early days, these studies focused on estimating the level of the return on a stock price index. Current studies reflect an interest in selecting predictive factors from a variety of input variables to forecast stock returns by applying neural networks. Several techniques such as regression coefficients [12], autocorrelations [13], backward stepwise regression [14], and genetic algorithms [14] have been employed by researchers to perform variable subset selection [12, 13]. In addition, several researchers subjectively selected the subsets of variables based on empirical evaluations [14]. The paper is organized as follows. Section 33.2 describes the data and preliminary statistics. Section 33.3 presents the research methodology. Section 33.4 presents and discusses the empirical results. The final section provides a summary and conclusion.


Table 33.1 Model summary

Model   R       R square   Adjusted R square   Std. error
1       0.653   0.427      0.422               768.26969
2       0.773   0.597      0.590               647.06456
3       0.834   0.695      0.688               564.83973
4       0.873   0.763      0.755               500.42457
5       0.894   0.800      0.791               461.69574

33.2 Data Description and Preliminary Statistics

33.2.1 Data Description

This paper applies two non-linear models, a univariate Markov Switching model and an Artificial Neural Network model, to the behavior of the Chinese Stock Exchange Composite Index using data for the period from 1999 to 2009. As the Shanghai Stock Exchange is the primary stock market in China and the Shanghai A Share Composite is the main index reflecting the Chinese stock market, this research adopts the Shanghai Composite (A Share). The data consist of daily observations of the Shanghai Stock Exchange Market general price index for the period 29 October 1999 to 31 August 2009, excluding all weekends and holidays, giving a total of 2369 observations. For both the MRS and the ANN models, the series are taken in natural logarithms.

33.2.2 Preliminary Statistics

In this part we explore the relationship between the Shanghai Composite and the Consumer Price Index, Retail Price Index, Corporate Goods Price Index, Social Retail Goods Index, Money Supply, Consumer Confidence Index and Stock Trading, using various t-tests and regression analysis to pick out the most relevant variables as the influence factors in our research. By using regression analysis we test the hypotheses and identify correlations between the variables. In the following multiple regression analysis we test the following hypotheses and see whether they hold true:

\[ H_0:\ \beta_1 = \beta_2 = \beta_3 = \cdots = \beta_K = 0 \quad \text{(regression insignificant)}, \]
\[ H_1:\ \text{at least one } \beta_k \neq 0. \]

In Table 33.1, R-square (\(R^2\)) is the proportion of variance in the dependent variable (Shanghai Composite Index) which can be predicted from the independent variables. This value indicates that 80% of the variance in Shanghai Composite Index can be predicted from the variables Consumer Price Index, Retail Price


Table 33.2 ANOVA

Model              Sum of squares   df    Mean square   F        Sig.
1   Regression     5.323E7          1     5.323E7       90.178   0.000
    Residual       7.142E7          121   590238.317
    Total          1.246E8          122
2   Regression     7.440E7          2     3.720E7       88.851   0.000
    Residual       5.024E7          120   418692.539
    Total          1.246E8          122
3   Regression     8.668E7          3     2.889E7       90.561   0.000
    Residual       3.797E7          119   319043.922
    Total          1.246E8          122
4   Regression     9.510E7          4     2.377E7       94.934   0.000
    Residual       2.955E7          118   250424.748
    Total          1.246E8          122
5   Regression     9.971E7          5     1.994E7       93.549   0.000
    Residual       2.494E7          117   213162.957
    Total          1.246E8          122

Index, Corporate Goods Price Index, Social Retail Goods Index, Money Supply, Consumer Confidence Index, and Stock Trading. It is worth pointing out that this is an overall measure of the strength of association, and does not reflect the extent to which any particular independent variable is associated with the dependent variable. In Table 33.2, the p-value is compared to the alpha level (typically 0.05). The F-test is significant: the p-value (Sig.) in the ANOVA table is reported as 0.000, which is less than 0.001. We therefore reject the null hypothesis that the regression coefficients (\(\beta\)'s) are all simultaneously zero, i.e. that Stock Trading, Consumer Price Index, Consumer Confidence Index, Corporate Goods Price Index and Money Supply have no effect on the Shanghai Composite. By looking at the Sig. column in particular, we gather that Stock Trading, Consumer Price Index, Consumer Confidence Index, Corporate Goods Price Index and Money Supply are variables with p-values less than 0.02 and hence highly significant. Turning to Fig. 33.1, the correlation numbers measure the strength and direction of the linear relationship between the dependent and independent variables. To show these correlations visually we use partial regression plots. Correlation points tend to form along a line going from the bottom left to the upper right, which is the same as saying that the correlation is positive. We conclude that the correlation of Stock Trading, Consumer Price Index, Consumer Confidence Index, Corporate Goods Price Index and Money Supply with the Shanghai Composite Index is positive because the points tend to form along this line.
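The summary statistics in Table 33.1 follow from the sums of squares in Table 33.2 by the standard regression identities. The sketch below recomputes them as a consistency check; the function name `summary` is hypothetical, and the sample size of 123 is inferred from the total degrees of freedom (122) rather than stated in the text. Small differences from the tabulated values are expected because the sums of squares are quoted to four significant figures.

```python
import math

N_OBS = 123  # inferred: total df of 122 in Table 33.2 implies 123 observations

# (number of predictors, regression SS, residual SS) for Models 1-5, Table 33.2
anova = [
    (1, 5.323e7, 7.142e7),
    (2, 7.440e7, 5.024e7),
    (3, 8.668e7, 3.797e7),
    (4, 9.510e7, 2.955e7),
    (5, 9.971e7, 2.494e7),
]


def summary(k, ss_reg, ss_res, n=N_OBS):
    """Recompute R, R^2, adjusted R^2, standard error and F from sums of squares."""
    ss_tot = ss_reg + ss_res
    df_res = n - k - 1
    r2 = ss_reg / ss_tot
    adj_r2 = 1 - (1 - r2) * (n - 1) / df_res
    ms_res = ss_res / df_res            # residual mean square
    f_stat = (ss_reg / k) / ms_res      # regression MS over residual MS
    return {
        "R": math.sqrt(r2),
        "R2": r2,
        "adj_R2": adj_r2,
        "std_err": math.sqrt(ms_res),
        "F": f_stat,
    }


for model, (k, ss_reg, ss_res) in enumerate(anova, start=1):
    s = summary(k, ss_reg, ss_res)
    print(f"Model {model}: R={s['R']:.3f}  R2={s['R2']:.3f}  "
          f"adjR2={s['adj_R2']:.3f}  SE={s['std_err']:.2f}  F={s['F']:.2f}")
```

Because every quantity is derived from the same two sums of squares, agreement with both tables is a check on the reconstruction of the flattened table columns as well as on the arithmetic.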


Fig. 33.1 Normal P–P plot regression standardized residual

Fig. 33.2 China CPI, CGPI and Shanghai A Share Composite Index

Because the CPI Index, the CGPI Index and the Money Supply Increased Ratio (M1 Increased Ratio - M2 Increased Ratio) are the factors most strongly correlated with the Share Composite Index, we follow Qi and Maddala [12] in working with macroeconomic indicators. The CPI Index, the CGPI Index and the Money Supply Increased Ratio, together with a data set from the Shanghai Stock Exchange Market, are used in the experiments to test the forecasting accuracy of the RBF network [12]. Figures 33.2 and 33.3 show the development of the Shanghai Composite Index alongside CPI, CGPI and MS over time.

33.3 Empirical Models

In this section, the univariate Markov Switching Model developed by Hamilton [2] is adopted to explore regime switching in the Shanghai Stock Exchange Composite Index. We then develop an artificial neural network (ANN), specifically an RBF network, to predict the moving trends of the stock index. We use the RBF method to find the relationship between the CPI Index, the CGPI Index and the Money Supply Increased Ratio on the one hand and the Stock Composite Index on the other. The RBF network is built with the efficient design function (newrb) of the Matlab Neural Network Toolbox. Finally, the forecasting performances of these two competing non-linear models are compared.

Fig. 33.3 China money supply increased (annual basis) and Shanghai A Share Composite Index

33.3.1 Markov Regime Switching Model and Estimation

33.3.1.1 Markov Regime Switching Model

The comparison of the in-sample forecasts is done on the basis of the Markov Switching/Hamilton filter mathematical notation, using the forecasting model of Marcelo Perlin (updated 21 June 2009). A potentially useful approach to modelling nonlinearities in time series is to assume different behavior (a structural break) from one subsample (or regime) to another. If the dates of the regime switches are known, modelling can be worked out with dummy variables. For example, consider the following regression model:

$$y_t = X_t' \beta_{S_t} + \varepsilon_t, \quad t = 1, \ldots, T \tag{33.1}$$

where $\varepsilon_t \sim NID(0, \sigma^2_{S_t})$, $\beta_{S_t} = \beta_0 (1 - S_t) + \beta_1 S_t$, $\sigma^2_{S_t} = \sigma^2_0 (1 - S_t) + \sigma^2_1 S_t$, and $S_t = 0$ or $1$ (Regime 0 or 1). Usually it is assumed that the possible difference between the regimes is a mean and volatility shift, but no autoregressive change. That is:

$$y_t = \mu_{S_t} + \phi (y_{t-1} - \mu_{S_{t-1}}) + \varepsilon_t, \quad \varepsilon_t \sim NID(0, \sigma^2_{S_t}) \tag{33.2}$$

where $\mu_{S_t} = \mu_0 (1 - S_t) + \mu_1 S_t$. If $S_t$ $(t = 1, \ldots, T)$ is known a priori, then the problem is just a usual dummy variable auto-regression problem. In practice, however, the prevailing regime is not usually directly observable. Denote by $P(S_t = j \mid S_{t-1} = i) = p_{ij}$ $(i, j = 0, 1)$ the transition probabilities, with $p_{i0} + p_{i1} = 1$, $i = 0, 1$. This kind of process, where the current state depends only on the state before, is called a Markov process, and the model a Markov switching model in the mean and the variance. The probabilities in a Markov process can be conveniently presented in matrix form:

$$\begin{pmatrix} P(S_t = 0) \\ P(S_t = 1) \end{pmatrix} = \begin{pmatrix} p_{00} & p_{10} \\ p_{01} & p_{11} \end{pmatrix} \begin{pmatrix} P(S_{t-1} = 0) \\ P(S_{t-1} = 1) \end{pmatrix}$$

Estimation of the transition probabilities $p_{ij}$ is usually done (numerically) by maximum likelihood as follows. The conditional probability density function for the observation $y_t$, given the state variables $S_t$, $S_{t-1}$ and the previous observations $F_{t-1} = \{y_{t-1}, y_{t-2}, \ldots\}$, is

$$f(y_t \mid S_t, S_{t-1}, F_{t-1}) = \frac{1}{\sqrt{2\pi\sigma^2_{S_t}}} \exp\left\{ -\frac{[y_t - \mu_{S_t} - \phi (y_{t-1} - \mu_{S_{t-1}})]^2}{2\sigma^2_{S_t}} \right\} \tag{33.3}$$

since $\varepsilon_t = y_t - \mu_{S_t} - \phi (y_{t-1} - \mu_{S_{t-1}}) \sim NID(0, \sigma^2_{S_t})$. The chain rule for conditional probabilities then yields the joint probability density function for the variables $y_t, S_t, S_{t-1}$, given past information $F_{t-1}$:

$$f(y_t, S_t, S_{t-1} \mid F_{t-1}) = f(y_t \mid S_t, S_{t-1}, F_{t-1}) \, P(S_t, S_{t-1} \mid F_{t-1})$$

such that the log-likelihood function to be maximized with respect to the unknown parameters becomes

$$\ell_t(\theta) = \log\left[ \sum_{S_t = 0}^{1} \sum_{S_{t-1} = 0}^{1} f(y_t \mid S_t, S_{t-1}, F_{t-1}) \, P(S_t, S_{t-1} \mid F_{t-1}) \right] \tag{33.4}$$

with $\theta = (p, q, \phi, \mu_0, \mu_1, \sigma^2_0, \sigma^2_1)$ and the transition probabilities $p = P(S_t = 0 \mid S_{t-1} = 0)$ and $q = P(S_t = 1 \mid S_{t-1} = 1)$. $P(S_0 = 1 \mid F_0)$ and $P(S_0 = 0 \mid F_0)$ are called the steady state probabilities and, given the transition probabilities $p$ and $q$, are obtained as:

$$P(S_0 = 1 \mid F_0) = \frac{1 - p}{2 - q - p}, \qquad P(S_0 = 0 \mid F_0) = \frac{1 - q}{2 - q - p}$$

33.3.1.2 Stock Composite Index Moving Trends Estimation

In our case, we have three explanatory variables $X_{1t}, X_{2t}, X_{3t}$ in a Gaussian framework (Normal distribution) and the input argument $S$, which is equal to $S = [1\ 1\ 1\ 1]$. The model for the mean equation is then:

$$y_t = X_{1t}\beta_{1,S_t} + X_{2t}\beta_{2,S_t} + X_{3t}\beta_{3,S_t} + \varepsilon_t, \quad \varepsilon_t \sim NID(0, \sigma^2_{S_t}) \tag{33.5}$$

where $S_t$ represents the state at time $t$, that is, $S_t = 1, \ldots, K$ ($K$ is the number of states); $\sigma^2_{S_t}$ is the error variance at state $S_t$; $\beta_{i,S_t}$ is the beta coefficient for explanatory variable $i$ at state $S_t$, where $i$ goes from 1 to $n$; and $\varepsilon_t$ is the residual vector, which follows a particular distribution (in this case Normal). With this choice of the input argument $S$, the coefficients and the model's variance switch according to the transition probabilities. The logic is therefore clear: the first elements of the input argument $S$ control the switching dynamic of the mean equation, while the last terms control the switching dynamic of the residual vector, including distribution parameters. Based on Gaussian maximum likelihood, the equations are represented as follows:

State 1 ($S_t = 1$): $y_t = X_{1t}\beta_{1,1} + X_{2t}\beta_{2,1} + X_{3t}\beta_{3,1} + \varepsilon_t$;

State 2 ($S_t = 2$): $y_t = X_{1t}\beta_{1,2} + X_{2t}\beta_{2,2} + X_{3t}\beta_{3,2} + \varepsilon_t$,

with

$$P = \begin{pmatrix} p_{11} & p_{21} \\ p_{12} & p_{22} \end{pmatrix}$$

as the transition matrix, which controls the probability of a regime switch from state j (column j) to state i (row i). The sum of each column in P is equal to one, since the columns represent the full set of probabilities of the process for each state.

33.3.2 Radial Basis Function Neural Networks

The specific type of ANN employed in this study is the Radial Basis Function (RBF) network, the most widely used among the many types of neural networks. RBFs were first used to solve the interpolation problem of fitting a curve exactly through a set of points. Fausett defines radial basis functions as "activation functions with a local field of response at the output" [15]. The RBF neural networks are trained to generate both time series forecasts and certainty factors. The RBF neural network is composed of three layers of nodes. The first is the input layer, which feeds the input data to each of the nodes in the second, or hidden, layer. The second layer of nodes differs greatly from other neural networks in that each node represents a data cluster which is centered at a particular point and has a given radius. The third and final layer consists of only one node. It sums the outputs of the second layer of nodes to yield the decision value [16].

The input of the ith neuron of the hidden layer is

$$k_i^q = \sqrt{\sum_{j=1}^{n} (W1_{ji} - X_j^q)^2} \times b1_i$$

and its output is

$$r_i^q = \exp\left(-(k_i^q)^2\right) = \exp\left(-\left(\sqrt{\sum_{j=1}^{n} (W1_{ji} - X_j^q)^2} \times b1_i\right)^2\right) = \exp\left(-\left(\lVert W1_i - X^q \rVert \times b1_i\right)^2\right)$$

where $b1_i$ is the threshold value, $X^q$ is the input feature vector with components $X_j^q$, and the approximant output $r_i^q$ is differentiable with respect to the weights $W1_i$. When an input vector is fed into each node of the hidden layer simultaneously, each node calculates the distance from the input vector to its own center. That distance value is transformed via some function, and the result is output from the node. The value output from the hidden layer node is multiplied by a constant or


weighting value. That product is fed into the third layer node, which sums all the products and any numeric constant inputs. Lastly, the third layer node outputs the decision value. A Gaussian basis function for the hidden units is given as $Z_j$ for $j = 1, \ldots, J$, where

$$Z_j = \exp\left(-\frac{\lVert X - \mu_j \rVert^2}{2\sigma_j^2}\right)$$

$\mu_j$ and $\sigma_j$ are the mean and the standard deviation, respectively, of the jth unit receptive field, and the norm is Euclidean. In order to obtain the tendency of the A Share Composite Index, we examine the sample performance of quarterly returns forecasts (40 quarters in total) for the Shanghai Stock Exchange Market from October 1999 to August 2009, using three exogenous macroeconomic variables, the CPI, CGPI and Money Supply (M1–M2, increased on an annual basis), as the inputs to the model. We use a Radial Basis Function network based on the learning algorithm presented above. Using the Matlab Neural Network Toolbox, the RBF network is created using an efficient design (newrb). According to Hagan et al. [17], a small spread constant results in a steep radial basis curve while a large spread constant results in a smooth radial basis curve; it is therefore better to force a small number of neurons to respond to an input. Our interest is in obtaining a single consensus forecast output, namely the sign of the prediction only, which is compared with the real sign of the prediction variable. After several tests and changes to the spread, we find that spread = 4 is satisfactory for our test; a good starting value for the spread constant is between 2 and 8 [17]. We set the first nine columns of y0 as the test samples.
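The three-layer computation described above can be sketched in a few lines. This is an illustrative forward pass only, not the newrb training procedure: centers, biases and output weights are taken as given, and all names are ours. The mapping from the spread constant to the hidden-unit bias, bias = sqrt(ln 2)/spread (about 0.8326/spread), is an assumption on our part; it is chosen so that a unit's output falls to 0.5 when the input lies exactly one spread away from the center.

```python
import math


def radbas_unit(x, center, bias):
    """Hidden-unit output exp(-(||center - x|| * bias)^2)."""
    dist = math.sqrt(sum((c - xi) ** 2 for c, xi in zip(center, x)))
    return math.exp(-(dist * bias) ** 2)


def rbf_forward(x, centers, biases, out_weights, out_bias):
    """Single-output RBF network: weighted sum of hidden-unit outputs plus a constant."""
    hidden = [radbas_unit(x, c, b) for c, b in zip(centers, biases)]
    return sum(w * h for w, h in zip(out_weights, hidden)) + out_bias


def classify(raw_output, threshold=0.5):
    """Map the raw network output to a regime label (1 or 0)."""
    return 1 if raw_output >= threshold else 0


spread = 4.0
bias = math.sqrt(-math.log(0.5)) / spread  # assumed convention, about 0.8326 / spread

# Hypothetical two-unit network with three inputs (e.g. CPI, CGPI, money supply).
centers = [[100.0, 101.0, 2.0], [104.0, 106.0, -3.0]]
raw = rbf_forward([101.0, 102.0, 1.0], centers, [bias, bias], [1.0, 0.0], 0.0)
print(round(raw, 4), classify(raw))
```

A larger spread makes each unit respond to inputs further from its center, which is the smoothing effect attributed to the spread constant in the text.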

33.4 Empirical Results

33.4.1 Stock Composite Index Moving Trends Estimation by MRS

Table 33.3 shows the estimated coefficients of the proposed MRS model along with the test statistics needed to evaluate the Stock Composite Index moving trends. The Likelihood Ratio test of the null hypothesis of linearity is statistically significant, so linearity is strongly rejected. The results in Table 33.3 highlight several further points. First, the value of the switching variable is 0.7506 at state 1 and -0.0161 at state 2, and the model's standard deviation σ takes the values 0.0893 and 0.0688 for regime 1 and regime 2 respectively; these values help us to identify regime 1 as the upward regime and regime 2 as the downward regime. Second, the duration measure shows that the upward regime lasts approximately 57 months, whereas the high volatility regime lasts approximately 24 months.


Table 33.3 Stock index moving trends estimation by MRS

Parameters                 Estimate             Std err
μ_0                        0.7506               0.0866
μ_2                        -0.0161              0.0627
σ²_0                       0.0893               0.0078
σ²_2                       0.0688               0.0076
Expected duration          56.98 time periods   23.58 time periods

Transition probabilities
p (regime1)                0.98
q (regime0)                0.96

Final log likelihood       119.9846
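The expected durations and the transition probabilities in Table 33.3 are two views of the same quantity: for a Markov chain, the expected time spent in a regime with stay probability p is 1/(1 - p). The sketch below (function names are ours) converts in both directions and shows that the reported durations are consistent with the rounded probabilities 0.98 and 0.96.

```python
def expected_duration(stay_prob):
    """Expected number of periods spent in a regime with the given stay probability."""
    return 1.0 / (1.0 - stay_prob)


def stay_probability(duration):
    """Inverse relation: stay probability implied by an expected duration."""
    return 1.0 - 1.0 / duration


for duration in (56.98, 23.58):
    print(duration, "periods -> stay probability", round(stay_probability(duration), 4))

for prob in (0.98, 0.96):
    print(prob, "-> expected duration", round(expected_duration(prob), 2))
```

The rounded probabilities give durations of 50 and 25 periods, which illustrates how sensitive the duration is to the second and third decimal places of a stay probability close to one.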

Fig. 33.4 Smoothed states probabilities (moving trends)

Because we use quarterly data to estimate the moving trends, the smoothed and filtered state probability lines appear sparse. Figure 33.4 shows the resulting smoothed probabilities of being in the upward and downward moving-trend regimes for the Shanghai Stock Exchange Market general price index. The filtered state probabilities are shown in Fig. 33.5; several periods of the sample are characterized by downward movement associated with the presence of a rational bubble in the capital market of China from 1999 to 2009.

33.4.2 Radial Basis Function Neural Networks

Interestingly, the best results we obtained from RBF training are 100% correct approximations of the sign on the test set and 90% on the training set. On the one hand, this conclusion is consistent with the findings of Hamilton and Lin [3] on the stock market and the business cycle. Hamilton and Lin [3] argued that the analysis of macroeconomic fundamentals was certainly a satisfactory explanation for stock volatility. To the best of our knowledge, the fluctuations in the


Fig. 33.5 Filtered states probabilities (moving trends)

Table 33.4 RBF training output

x           y0   T      x           y0   T
0.80937     1    1      0.031984    0    0
0.30922     0    0      0.80774     1    0
0.96807     1    1      0.68064     1    0
1.0459      1    1      0.74969     1    0
-0.011928   0    0      0.54251     1    1
0.92        1    1      0.91874     1    1
0.81828     1    1      0.50662     1    1
0.054912    0    0      0.44189     0    1
0.34783     0    0      0.59748     1    1
0.80987     1    1      0.69514     1    1
1.1605      1    1      1.0795      1    1
0.66608     1    1      0.16416     0    0
0.22703     0    0      0.97289     1    1
0.45323     0    0      -0.1197     0    0
0.69459     1    1      0.028258    0    0
0.16862     0    0      0.087562    0    0
0.83891     1    1      -0.084324   0    0
0.61556     1    1      1.0243      1    1
1.0808      1    1      0.98467     1    1
-0.089779   0    0      0.0032105   0    0

level of macroeconomic variables such as CPI and CGPI and other economic activity are a key determinant of the level of stock returns [18]. On the other hand, in a related application, it has also been shown that RBFs have the "best" approximation property: there is always a choice of parameters that is better than any other possible choice, a property that is not shared by MLPs. Due to the Normal distribution intervals, the outputs are mapped as $y_0 = F(X)$, with $F(X) = 1$ if $X \ge 0.5$ and $F(X) = 0$ if $X < 0.5$. Table 33.4 gives the resulting outputs. From the thresholded outputs we find that the duration of regime 1 is 24 quarters and that of regime 0 is 16 quarters. The comparison of the MRS and RBF models can be seen in Table 33.5. It is clear that the RBF model outperforms the MRS model on the regime duration estimation.
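The thresholding rule and the regime durations quoted above can be reproduced directly from Table 33.4. The sketch below transcribes the raw outputs x and the targets T from the table (left half followed by right half), applies the 0.5 threshold, and converts quarter counts to months; the variable names are ours.

```python
# Raw network outputs x and targets T, transcribed from Table 33.4.
x = [
    0.80937, 0.30922, 0.96807, 1.0459, -0.011928, 0.92, 0.81828, 0.054912,
    0.34783, 0.80987, 1.1605, 0.66608, 0.22703, 0.45323, 0.69459, 0.16862,
    0.83891, 0.61556, 1.0808, -0.089779,
    0.031984, 0.80774, 0.68064, 0.74969, 0.54251, 0.91874, 0.50662, 0.44189,
    0.59748, 0.69514, 1.0795, 0.16416, 0.97289, -0.1197, 0.028258, 0.087562,
    -0.084324, 1.0243, 0.98467, 0.0032105,
]
target = [
    1, 0, 1, 1, 0, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0,
    0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 1, 0, 0, 0, 0, 1, 1, 0,
]

# y0 = F(X): 1 if X >= 0.5, else 0.
y0 = [1 if value >= 0.5 else 0 for value in x]

matches = sum(1 for pred, obs in zip(y0, target) if pred == obs)
rbf_months = (3 * y0.count(1), 3 * y0.count(0))
observed_months = (3 * target.count(1), 3 * target.count(0))

print("agreement:", matches, "of", len(x))
print("RBF regime 1 / regime 0 (months):", rbf_months)
print("observed regime 1 / regime 0 (months):", observed_months)
```

The month counts correspond to the Radial basis function and Observed durations rows of Table 33.5.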

Table 33.5 Regime comparison of stock index moving trends

Model                    Regime 1 (months)   Regime 0 (months)
Observed durations       66                  54
Markov-switching         57                  24
Radial basis function    72                  48

33.5 Conclusion and Future Work

In this chapter, we compared the forecasting performance of two nonlinear models to address issues with respect to the behavior of aggregate stock returns in the Chinese stock market. Rigorous comparisons between the two nonlinear estimation methods have been made. From the Markov Regime Switching model, it can be concluded that real output growth is subject to abrupt changes in the mean associated with economy states. On the other hand, the ANN method, developed with the prediction algorithm to obtain abnormal stock returns, indicates that stock returns should take into account the level of the influence generated by macroeconomic variables. Further study will concentrate on prediction of market volatility using this research framework.

Acknowledgments This work was supported by the Pilot Funds (2009) from Suzhou Municipal Government (Singapore Industrial Park and Higher Educational Town, SIPEDI) for XJTLU Lab for Research in Financial Mathematics and Computing.

References

1. Henderson JM, Quandt RE (1958) Microeconomic theory: a mathematical approach. McGraw-Hill, New York
2. Hamilton JD (1989) A new approach to the economic analysis of nonstationary time series and the business cycle. Econometrica 57(2):357–384
3. Hamilton JD, Lin G (1996) Stock market volatility and the business cycle. J Appl Econom 11(5):573–593
4. Hamilton JD (1996) Specification tests in Markov-switching time-series models. J Econom 70(1):127–157
5. Hamilton JD, Susmel R (1994) Autoregressive conditional heteroskedasticity and changes in regime. J Econom 64(1–2):307–333
6. Edwards S, Susmel R (2001) Volatility dependence and contagion in emerging equities markets. J Dev Econ 66(2):505–532
7. Coe PJ (2002) Financial crisis and the great depression: a regime switching approach. J Money Credit Bank 34(1):76–93
8. Hamilton JD (1994) Time series analysis. Princeton University Press, Princeton
9. Chen S, Cowan CFN, Grant PM (1991) Orthogonal least squares learning algorithm for radial basis function network. IEEE Trans Neural Netw 2(2):302–309
10. Swanson N, White H (1995) A model selection approach to assessing the information in the term structure using linear models and artificial neural networks. J Bus Econ Stat 13(8):265–275
11. Zhang G, Patuwo BE, Hu MY (1998) Forecasting with artificial neural networks: the state of the art. Int J Forecast 14(1):35–62
12. Qi M, Maddala GS (1999) Economic factors and the stock market: a new perspective. J Forecast 18(3):151–166
13. Desai VS, Bharati R (1998) The efficiency of neural networks in predicting returns on stock and bond indices. Decis Sci 29(2):405–425
14. Motiwalla L, Wahab M (2000) Predictable variation and profitable trading of US equities: a trading simulation using neural networks. Comput Oper Res 27(11–12):1111–1129
15. Fausett L (1994) Fundamentals of neural networks: architectures, algorithms and applications. Prentice-Hall, Upper Saddle River
16. Moody J, Darken C (1989) Fast learning in networks of locally tuned processing units. Neural Comput 1(2):281–294
17. Hagan MT, Demuth HB, Beale MH (1996) Neural network design. PWS Publishing, Boston
18. Liu D, Zhang L (2010) China stock market regimes prediction with artificial neural network and markov regime switching. In: Lecture notes in engineering and computer science: proceeding of the world congress on engineering 2010, WCE 2010, 30 June–2 July, 2010 London, UK, pp 378–383

Chapter 34

Fund of Hedge Funds Portfolio Optimisation Using a Global Optimisation Algorithm

Bernard Minsky, M. Obradovic, Q. Tang and Rishi Thapar

Abstract Portfolio optimisation for a Fund of Hedge Funds ("FoHF") has to address the asymmetric, non-Gaussian nature of the underlying returns distributions. Furthermore, the objective functions and constraints are not necessarily convex or even smooth. Traditional portfolio optimisation methods such as mean–variance optimisation are therefore not appropriate for such problems, and global search optimisation algorithms could serve better. In implementing such an approach, the goal is also to incorporate information about future expected outcomes to determine the optimised portfolio, rather than to optimise a portfolio on historic performance. In this paper, we consider the suitability of global search optimisation algorithms applied to FoHF portfolios, and use one of these algorithms to construct an optimal portfolio of investable hedge fund indices given forecast views of the future and our confidence in such views.

B. Minsky (corresponding author), R. Thapar
International Asset Management Ltd., 7 Clifford Street, London, W1S 2FT, UK

M. Obradovic, Q. Tang
School of Mathematical and Physical Sciences, Sussex University, Brighton, BN1 9RF, UK

S. I. Ao and L. Gelman (eds.), Electrical Engineering and Applied Computing, Lecture Notes in Electrical Engineering, 90, DOI: 10.1007/978-94-007-1192-1_34, Springer Science+Business Media B.V. 2011


34.1 Introduction

The motivation for this paper was to develop a more robust approach to constructing portfolios of hedge fund investments that takes account of the issues that confront portfolio managers:

1. The non-Gaussian, asymmetric nature of hedge fund returns;
2. The tendency of optimisation algorithms to find corner solutions;
3. The speed in computation and efficiency in finding the solution; and
4. The desire to incorporate forecast views into the problem specification.

We describe here how each of these issues was addressed and illustrate with reference to the optimising of a portfolio of investable hedge fund indices. This paper synthesises a review of the applicability of global search optimisation algorithms for financial portfolio optimisation with the development of a Monte Carlo simulation approach to forecasting hedge fund returns and implementing the methodology into an integrated forecasting and optimisation application. In Sect. 34.2, we summarise the review of global search optimisation algorithms and their applicability to the FoHF portfolio optimisation problem. In Sect. 34.3, we describe the Monte Carlo simulation technique adopted using resampled historical returns data of hedge fund managers and also how we incorporated forecast views and confidence levels, expressed as probability outcomes, into our returns distribution data. In Sect. 34.4, we report the results of applying the methodology to a FoHF portfolio optimisation problem and in Sect. 34.5, we draw our conclusions from the study.

34.2 Review of Global Search Optimisation Algorithms

The FoHF portfolio optimisation problem is an example of the typical minimisation problem in finance:

$$\min_x f(x) \quad \text{subject to} \quad g(x)\;(<)\!= g_0,\ \ldots,\ h(x)\;(<)\!= h_0$$

where f, called the objective function, is non-convex and may be non-smooth. The g, …, h are constraint functions, with g0, …, h0 as minimum thresholds. The variable x usually denotes the weights assigned to each asset, and the constraints will usually include the buying and shorting limits on each asset. It is well known that many of the objective functions and constraints specified in financial minimisation problems are not differentiable. Traditional asset management has relied on the Markowitz specification as a mean–variance optimisation problem which is soluble by classical optimisation methods. However,


in FoHF portfolio optimisation the distributions of hedge fund returns are non-Gaussian, and the typical objective functions and constraints are not limited to the simple mean, variance and higher order moments of the distribution. We have previously [1] discussed the use of performance and risk statistics such as maximum drawdown, downside deviation, co-drawdown, and omega as potential objective functions and constraint functions which are not obviously differentiable. With the ready availability of powerful computing and less demand on smoothness, it is possible to look for global optimisation algorithms which do not require regularity of the objective (constraint) functions to solve the financial minimisation problem. In our review of the literature [1], we found that there are three main ideas in global optimisation: Direct, Genetic Algorithm, and Simulated Annealing. In addition, there are a number of other methods which are derived from one or more of the ideas listed above. A key characteristic of fund of funds portfolio optimisation, in common with other portfolio optimisation problems, is that the dimensionality of the problem space is large. Typically, a portfolio of hedge funds will have between 20 and 40 assets, with some commingled funds having significantly more. This means that the search algorithm cannot conduct an exhaustive test of the whole space efficiently. For example, a portfolio of 40 assets gives a 40-dimensional space, and an initial grid of 100 points on each axis produces 100^40 = 10^80 initial test points to evaluate in locating the region where the global minimum might be found. This would require computing power far beyond what is feasible. Each of the methods we considered in our review requires an initial search set. The choice of the initial search set is important, as the quality of the set affects the workload required to find the global minimum.
The actual approach to moving from the initial set to finding better and better solutions differs across the methods and our search also revealed some approaches that combine the methods to produce a hybrid algorithm. In the paper [1] we evaluated seven algorithms across the methods to identify which method and specific algorithm was best suited to our FoHF portfolio optimisation problem. The algorithms considered are described here.
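The size of the exhaustive grid discussed above is easy to check with integer arithmetic. The sketch below counts the grid points for 40 assets with 100 points per axis and, under a purely hypothetical rate of one billion objective evaluations per second, estimates how long a full sweep would take.

```python
points_per_axis = 100
dimensions = 40
grid_points = points_per_axis ** dimensions  # exact integer arithmetic

evals_per_second = 10 ** 9                    # hypothetical evaluation rate
seconds_per_year = 365 * 24 * 3600
years_needed = grid_points // (evals_per_second * seconds_per_year)

print("grid points: 10^%d" % (len(str(grid_points)) - 1))
print("years for a full sweep: about 10^%d" % (len(str(years_needed)) - 1))
```

Even a grid of 10 points per axis would still give 10^40 points, so the conclusion that exhaustive search is infeasible does not depend on the exact grid resolution.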

34.2.1 PGSL: Probabilistic Global Search Lausanne

PGSL is a hybrid algorithm, proposed by Raphael [2], that draws on the Simulated Annealing method. It adapts its search grid to concentrate on favourable regions of the search space and to intensify the density of sampling in these attractive regions. The search space is sampled using a probability distribution function for each axis of the multi-dimensional search space. At the outset of the search process, the probability distribution function is a uniform distribution with intervals of


constant width. During the process, a probability distribution function is updated by increasing probability and decreasing the width of intervals of the regions with good functional values. A focusing algorithm is used to progressively narrow the search space by changing the minimum and maximum of each dimension of the search space.

34.2.2 MCS: Multi-Level Co-ordinate Search

MCS belongs to the family of branch and bound methods. It seeks to solve bound constrained optimisation problems by combining global search (partitioning the search space into smaller boxes) and local search (partitioning sub-boxes based on desired functional values). In this way, the search is focused on sub-boxes where low functional values are expected. The balance between the global and local parts of the search is obtained using a multi-level approach. Each sub-box is assigned a level, which is a measure of how many times the sub-box has been processed. The global search part of the optimisation process starts with the sub-boxes that have low level values. At each level, the box with the lowest functional value determines the local search process. The optimisation method is described in the paper by Huyer and Neumaier [3]. Finance papers that have examined MCS include Aggregating Risk Capital [4] and Optimising Omega [5]. In Aggregating Risk Capital [4], the Value-at-Risk of a portfolio is calculated using marginal distributions of the risk factors, and MCS is employed to search for the best-possible lower bound on the joint distribution of the marginal distributions of the risk factors. Optimising Omega [5] uses MCS to optimise the Omega ratio, a non-smooth performance measure, of a portfolio.

34.2.3 MATLAB Direct

The Direct Search algorithm, available in MATLAB's Genetic Algorithm and Direct Search Toolbox, uses a pattern search methodology for solving bound linear or non-linear optimisation problems [6]. The algorithms used are the Generalised Pattern Search (GPS) and Mesh Adaptive Search (MADS) algorithms. The pattern search algorithm generates a set of search directions or search points to approach an optimal point. Around each search point, an area, called a mesh, is formed by adding the current point to a scalar multiple of a set of vectors called a pattern. If a point in the mesh is found that improves the objective function at the current point, the new point becomes the current point for the next step, and so on. The GPS method uses fixed direction vectors and MADS uses random vectors to define a mesh.
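The polling idea behind pattern search can be illustrated with a minimal sketch. This is not MATLAB's implementation: it assumes fixed coordinate directions as the pattern, no bounds or constraints, and a simple rule that halves the mesh size whenever no mesh point improves on the current point. The function name and the test objective are hypothetical.

```python
def pattern_search(f, x0, step=1.0, tol=1e-6, max_iter=10_000):
    """Poll along +/- coordinate directions; shrink the mesh when no point improves."""
    x = list(x0)
    fx = f(x)
    for _ in range(max_iter):
        if step < tol:
            break
        improved = False
        for i in range(len(x)):
            for sign in (1.0, -1.0):
                trial = list(x)
                trial[i] += sign * step        # mesh point: current point + step * direction
                f_trial = f(trial)
                if f_trial < fx:               # accept the first improving mesh point
                    x, fx = trial, f_trial
                    improved = True
                    break
            if improved:
                break
        if not improved:
            step *= 0.5                        # refine the mesh around the current point
    return x, fx


def kinked(v):
    """Non-smooth test objective with its minimum at (1, -2)."""
    return abs(v[0] - 1.0) + abs(v[1] + 2.0)


best_point, best_value = pattern_search(kinked, [0.0, 0.0])
print(best_point, best_value)
```

Because the method only compares function values, it copes with the kinks in the objective, which is the property that makes this family attractive for non-differentiable risk statistics.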


34.2.4 MATLAB Simulated Annealing

The Simulated Annealing method uses a probabilistic search algorithm that models the physical process of heating a material and then slowly lowering the temperature to decrease defects, thus minimising the system energy [6]. By analogy with this physical process, each step in the Simulated Annealing algorithm replaces the current point by another point that is chosen depending on the difference between the functional values at the two points and the temperature variable, which is systematically decreased during the process.
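A minimal sketch of the annealing step described above follows. It is an illustration under our own assumptions, not the MATLAB routine: candidates are drawn uniformly around the current point, acceptance follows the Metropolis rule, and the temperature is reduced geometrically. The function names, parameter values and test objective are hypothetical.

```python
import math
import random


def simulated_annealing(f, x0, step=0.5, t0=1.0, cooling=0.95, n_iter=2000, seed=0):
    """Return the best point visited and its objective value."""
    rng = random.Random(seed)
    x, fx = list(x0), f(x0)
    best_x, best_f = list(x), fx
    temp = t0
    for _ in range(n_iter):
        cand = [xi + rng.uniform(-step, step) for xi in x]
        f_cand = f(cand)
        delta = f_cand - fx
        # Always accept improvements; accept worse points with probability exp(-delta / temp).
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            x, fx = cand, f_cand
            if fx < best_f:
                best_x, best_f = list(x), fx
        temp = max(temp * cooling, 1e-12)  # geometric cooling with a small floor
    return best_x, best_f


def rough(v):
    """Non-smooth, multi-modal test objective."""
    return sum(abs(c) + 0.5 * abs(math.sin(5.0 * c)) for c in v)


start = [3.0, -2.0]
result_x, result_f = simulated_annealing(rough, start)
print(result_x, result_f)
```

Changing the seed generally changes the answer, which is consistent with the run-to-run instability reported for this method in the comparison below.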

34.2.5 MATLAB Genetic Algorithm

MATLAB's Genetic Algorithm is based on the principles of natural selection and uses the idea of mutation to produce new points in the search for an optimised solution [6]. At each step, the Genetic Algorithm selects individuals at random from the current population to be parents and uses them to produce the children for the next generation. In this way, the population evolves toward an optimal solution.

34.2.6 TOMLAB LGO

Tomlab's Global Optimiser, TOMLAB/LGO, combines global and local search methodologies [7]. The global search is implemented using the branch and bound method and adaptive random search. The local search is implemented using a generalised reduced gradient algorithm.

34.2.7 NAG Global Optimiser

NAG's Global Optimiser, E05JBF, is based on MCS, as described above. E05JBF is described in NAG's Library Routine Document [8] and Optimising Omega [5]. The above algorithms were evaluated on three constrained optimisation problems. The constraints consisted of both linear constraints on the allocation weights to the assets and constraints on the level of functions that characterise the portfolio's performance or risk. The algorithms were measured on time to run, percentage of corners in the optimal solution, and deviation from the average optimal solution. A simple scoring rule combining these three factors as a weighted sum was constructed. There was considerable variation in relative performance between the algorithms across the different tests. Two algorithms, MATLAB Annealing and


MATLAB Genetics, were found to be unstable giving rise to different results when repeated runs of the same problem and environment were performed. They also produced widely different results, from very good to very bad, across the tests and were rejected from consideration easily. The other five algorithms all produced acceptable results with MATLAB Direct scoring best across the constrained optimisation examples. PGSL, the adaptive Simulated Annealing algorithm performs reasonably in most tests and has been used by IAM for the past 4 years. Therefore, we chose to compare MATLAB Direct with PGSL in our portfolio optimisation implementation.

34.3 Implementing the Global Search Optimisation Algorithm

Traditional optimisation of portfolios has focused on determining the optimal portfolio given the history of asset returns and assuming that the distribution of returns is Gaussian and stationary over time. Our experience is that these assumptions do not hold and that any optimisation should use the best forecast we can make of the horizon for which the portfolio is being optimised. When investing in hedge funds, liquidity terms are quite onerous, with lock ups and redemption terms from monthly to annual frequencies, and notice periods ranging from a few days to 6 months. This means that the investment horizon tends to be 6–12 months ahead, to reflect the minimum time any investment will be in a portfolio. The forecast performance of the assets within the portfolio is produced using Monte Carlo simulation and re-sampling. The objective is to produce a random sample of likely outcomes, period by period, for the forecast horizon, based on the empirical distributions observed for the assets modified by our views as to the likely performance of the individual assets. This is clearly a non-trivial exercise, further complicated by our wish to maintain the relationship between the asset distributions and any embedded serial correlation within the individual asset distribution. The approach implemented has three components:

1. Constructing a joint distribution of the asset returns from which to sample;
2. Simulating the returns of the assets over the forecast horizon; and
3. Calculating the relevant objective function and constraints for the optimisation.

34.3.1 Constructing the Joint Distribution of Asset Returns

We used bootstrapping in a Monte-Carlo simulation framework to produce the distribution of future portfolio returns. Bootstrapping is a means of using the available data by resampling with replacement. This generates a richer sample than would otherwise be available. To preserve the relationship between the assets we


treat the set of returns for the assets in a time period as an observation of the joint distribution of the asset returns. An enhancement to this sampling scheme to capture any serial correlation is to block sample a group contiguously, say three periods together. Block sampling of three periods at a time offers around 10 million distinct samples of blocks of three time periods. As we used bootstrapping to sample from the distribution and we wished to preserve the characteristics of the joint distribution, we needed to define a time range over which we have returns for all of the assets in the portfolio. Hedge funds report returns generally on a monthly basis, which means that we needed to go back a reasonable period of time to obtain a sufficiently large number of observations to enable the bootstrap sampling to be effective. For hedge funds this is complicated because many of the funds have not been in existence for very long, with the median life of a hedge fund being approximately 3 years. Although the longer the range that can be used for the joint distribution the greater the number of points available for sampling, the lack of stationarity within the distribution leads us to select a compromise period, typically 5 years, as the desired range. Where a hedge fund does not have a complete 5 year history, we employed a backfill methodology to provide the missing data. There are a number of approaches to backfilling asset return time series such as selecting a proxy asset to fill the series; using a strategy index with a random noise component; constructing a factor model of the asset returns from the available history and using the factor return history and model to backfill; or to randomly select an asset from a set of candidate assets that could have been chosen for the portfolio for the periods that the actual asset did not exist. We adopted this last method, selecting an asset from a set of available candidates within a peer group for the missing asset. 
Where the range for which returns are missing was long, we repeated the exercise of selecting an asset at random from the available candidates within the peer group, say, every six periods. Our reasoning for this approach is that, as portfolio managers and given the strategy allocation of the portfolio, we would have chosen an asset from the candidate peer group available at that time to complete the portfolio. Using this process we constructed a complete set of returns for each of the assets going back, say, 5 years. The quality of the backfill depends on how narrowly the candidate peer group is defined. At International Asset Management Limited (IAM), we have defined our own internal set of strategy peer groups that best reflect our interpretation of the strategies in which we invest. This is because the hedge fund classifications adopted by most of the index providers tend to be broad, and can include funds that would not feature in IAM's classifications.
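The block sampling scheme described in this section can be sketched as follows. This is only an illustrative sketch in Python (the chapter's own implementation is in MATLAB and is not shown); the function name `block_bootstrap`, the block length default and the randomly generated history are assumptions made for the example. Whole rows are sampled so that each draw keeps the returns of all assets for the same period together.

```python
import random


def block_bootstrap(returns, horizon, block_len=3, rng=None):
    """Resample `horizon` periods of joint returns in contiguous blocks.

    `returns` is a list of rows, one per historical period; each row holds
    the returns of every asset in that period, so sampling whole rows keeps
    the cross-asset (joint) structure intact.
    """
    rng = rng or random.Random()
    n_periods = len(returns)
    if n_periods < block_len:
        raise ValueError("history is shorter than one block")
    sample = []
    while len(sample) < horizon:
        # Pick a random block start such that the whole block fits in the history.
        start = rng.randrange(n_periods - block_len + 1)
        sample.extend(returns[start:start + block_len])
    return sample[:horizon]  # trim the final block if it overshoots the horizon


# Hypothetical history: 60 months (5 years) of returns for 3 assets.
history_rng = random.Random(0)
history = [[history_rng.gauss(0.005, 0.02) for _ in range(3)] for _ in range(60)]
path = block_bootstrap(history, horizon=12, rng=random.Random(1))
```

Sampling block starts uniformly is the simplest choice; it slightly under-represents the first and last periods of the history, which may or may not matter for a given application.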

34.3.2 Simulating Returns Over the Forecast Horizon

We simulated the returns of the assets using a block bootstrap of the empirical joint distributions, which are modified by probabilistically shifting the expected

Fig. 34.1 Probabilistic shifting of expected mean (chart title: Impact of Mixing Views; series: Pessimistic, Most Likely, Optimistic, Mixed; horizontal axis from -15% to 35%; vertical axis from 0% to 14%)

return of the sample according to our assessment of the likely return outcomes for the assets. First we describe the process of incorporating forecast views by expectation shifting and then we describe the block bootstrapping method. The desire to include forecast views, expressed as expected annual returns, and confidence, expressed as probabilities, within a portfolio optimisation problem has been addressed in a number of ways. Black and Litterman developed an approach in which the modeller expresses a view on the expected mean of a returns series and attaches a confidence to each view. This approach is Bayesian and adapts the traditional Mean–Variance approach to give more stable and intuitive allocations which do not favour corner solutions. However, we have chosen an empirical approach to including views, mixing probabilistically mean-shifted versions of the empirical distribution, which allows a range of outcomes to be specified with a confidence associated with each view. Figure 34.1 shows how applying a probabilistic shift to the mean of a distribution not only repositions the distribution but also changes the higher order moments, as the spread, skew and kurtosis all change. In Table 34.1 the forecast views for a number of strategies are set out with their associated confidence. The optimistic, pessimistic and most likely views are the best assessment of the potential expected return of the mean fund within the strategy. The confidence level represents the likelihood of that view prevailing. We note that the three confidence levels sum to one. We use these likelihoods to determine for each asset, according to its strategy, which shift should be applied to the distribution for that simulation. This is implemented by simply sampling from the uniform distribution on [0, 1] and dividing that interval into three segments according to the confidence levels associated with the three views.
Recognising that each asset does not track its strategy with certainty we calculate the beta for

Table 34.1 Forecast views and confidence by hedge fund strategy

Strategy                    Opt. view (%)  Conf./prob. (%)  Pess. view (%)  Conf./prob. (%)  Most likely view (%)  Conf./prob. (%)
Convertible bond arbitrage  17.5           25               7.5             25               12.5                  50
Credit                      17.5           25               7.5             25               12.5                  50
Event driven                10.0           25               0.0             25               5.0                   50
Fixed income rel val        15.0           25               10.0            25               12.5                  50

the asset with respect to the strategy and adjust the return by the randomly chosen shift ("k") multiplied by the asset beta calculated. So the return in any period ("t") for an asset ("a") which follows strategy ("s") for simulation trial ("n") is:

\[ r^{m}_{a,t,n} = r^{raw}_{a,t,n} + \beta \times \mathrm{shift}_{s,k} \]

where $\beta$ is the beta of the asset with respect to its strategy.
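The view-mixing step can be illustrated with a short Python sketch. The views are those given for the Event driven strategy in Table 34.1; everything else is an assumption made for illustration, in particular the helper `monthly_shift` (the text does not spell out how a shift is derived from a view, so here it is assumed to move a historical monthly mean to one twelfth of the annual view) and the hypothetical beta, raw return and historical mean.

```python
import random

# Views for the "Event driven" strategy, taken from Table 34.1:
# (label, expected annual return, confidence).
EVENT_DRIVEN_VIEWS = [
    ("pessimistic", 0.000, 0.25),
    ("most likely", 0.050, 0.50),
    ("optimistic", 0.100, 0.25),
]


def choose_view(views, u):
    """Map a uniform draw u in [0, 1) to a view using cumulative confidence."""
    cumulative = 0.0
    for label, expected_return, confidence in views:
        cumulative += confidence
        if u < cumulative:
            return label, expected_return
    return views[-1][0], views[-1][1]  # guard against rounding when u is near 1


def monthly_shift(view_annual, historical_mean_monthly):
    """ASSUMPTION: the shift moves the historical monthly mean to view / 12."""
    return view_annual / 12.0 - historical_mean_monthly


def shifted_return(raw_return, beta, shift):
    """Modified return: raw return plus the asset beta times the strategy shift."""
    return raw_return + beta * shift


rng = random.Random(42)
label, view = choose_view(EVENT_DRIVEN_VIEWS, rng.random())
shift = monthly_shift(view, historical_mean_monthly=0.004)  # hypothetical mean
adjusted = shifted_return(raw_return=0.012, beta=0.8, shift=shift)  # hypothetical inputs
```

Drawing one uniform number per asset and trial and comparing it with the cumulative confidences (0.25, 0.75, 1.0 here) selects each view with the stated probability.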

34.3.3 Calculating the Objective Function and Constraint Functions

In implementing the bootstrapped Monte Carlo simulation we simulate 500 trials or scenarios for the assets in the portfolio. This produces a distribution of returns for each asset and the distributions of any statistics we may wish to compute. Our objective and constraint functions are statistics based on the distribution of portfolio returns. Given a set of asset allocation weights, the distribution of portfolio returns and the distributions of its statistics may be calculated. It is worth discussing how we use this information within the optimisation algorithm. To do this we use as an example maximising expected return subject to a limit on maximum drawdown. As we have chosen to optimise expected return, our objective function is simply the median of the distribution of portfolio returns. If instead we set our objective to ensure that performance is at an acceptable level in most circumstances, we might choose the fifth percentile of return as the objective function, so as to maximise the return achieved in all but the worst 5% of outcomes. This reflects the flexibility we gain by using a simulated distribution as the data input to the optimisation process. In PGSL, as with almost all of the global search optimisation algorithms, both the linear and non-linear constraints are defined as penalty functions added to the objective function, and hence are soft constraints rather than hard constraints that must be satisfied. The weight attached to each penalty function determines how acceptable a constraint violation is. In our example, we define the penalty function as the amount by which the average of the maximum drawdown over the lowest five percent of the maximum drawdown distribution breaches the constraint boundary (zero if the conditional average is within the constraint level), multiplied by an importance factor:

\[ \text{Max dd penalty} = \frac{\max\bigl(\text{Constraint}_{dd} - \operatorname{average}(\text{Max dd}_n \mid \text{Lower 5\%ile}),\ 0\bigr)}{\text{No. of Trials}} \times \text{Importance} \]

This measure is analogous to an expected tail loss or Conditional VaR (CVaR), in that it is an estimate of the conditional expectation of the maximum drawdown for the lower tail of the distribution of drawdowns.

Table 34.2 FoHF portfolio optimisation problem

Objective: Maximise median portfolio return
Subject to:
  Maximum drawdown over forecast period           Less than 5%
  Total allocations for full investment           100%
  Cash                                            10%
Within the following constraints:
  RBC Hedge 250 Equity Market Neutral             Between 10 and 16%
  RBC Hedge 250 Equity Long/Short Directional     Between 14 and 20%
  All Long/Short Equity                           Between 24 and 36%
  RBC Hedge 250 Fixed Income Arbitrage            Between 7 and 13%
  RBC Hedge 250 Macro                             Between 10 and 20%
  RBC Hedge 250 Managed Futures                   Between 10 and 20%
  RBC Hedge 250 Credit                            Between 5 and 15%
  RBC Hedge 250 Mergers and Special Situations    Between 0 and 10%
  RBC Hedge 250 Multi-Strategy                    Between 0 and 10%
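A tail-drawdown penalty of this kind might be computed along the following lines. This is a sketch under stated assumptions rather than the authors' implementation: drawdowns are expressed as negative numbers (so the "lowest" five percent are the worst), the function names are hypothetical, and the formula's division by the number of trials is left out so that the penalty is simply the breach of the limit times the importance factor.

```python
def max_drawdown(returns):
    """Largest peak-to-trough fall of the cumulative value, as a negative number."""
    value = peak = 1.0
    worst = 0.0
    for r in returns:
        value *= 1.0 + r
        peak = max(peak, value)
        worst = min(worst, value / peak - 1.0)
    return worst


def drawdown_penalty(trial_returns, limit=-0.05, tail=0.05, importance=1.0):
    """Soft-constraint penalty on the average drawdown in the worst `tail` of trials."""
    drawdowns = sorted(max_drawdown(path) for path in trial_returns)  # worst first
    n_tail = max(1, int(len(drawdowns) * tail))
    tail_average = sum(drawdowns[:n_tail]) / n_tail
    # Positive only when the tail average is worse (more negative) than the limit.
    return max(limit - tail_average, 0.0) * importance


# Two hypothetical simulated paths of portfolio returns.
example_paths = [[0.10, -0.20], [0.01, 0.01]]
example_penalty = drawdown_penalty(example_paths, limit=-0.05, tail=0.5)
```

In a penalty-based optimiser the value returned here would be subtracted from (or added to, depending on sign convention) the objective, with `importance` controlling how strongly a breach is discouraged.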

34.4 Results of Optimising a FoHF Portfolio

The approach to optimising a FoHF portfolio has been implemented in MATLAB and applied to a portfolio of eight RBC Hedge 250 hedge fund strategy indices. The monthly returns for the indices from July 2005 are available from the RBC website. As the simulation requires 5 years of monthly returns, the series were backfilled from IAM's pre-determined group of candidate assets within the relevant investment strategy peer group, using random selection as previously described. The portfolio was optimised with an objective function to maximise median returns subject to constraints on the maximum and minimum allocations to each asset, a constraint on the maximum and minimum allocation to Long/Short Equity strategies, and a maximum allowable maximum drawdown of 5% over the forecast horizon. Thus the optimisation problem is as set out in Table 34.2. First, we noted that the total allocation satisfies the equality constraint that all capital is deployed with both PGSL and MATLAB Direct, and that all the asset allocation constraints, including the constraint on all Long/Short Equity strategies, are satisfied by MATLAB Direct but not by PGSL. Secondly we noted that

Table 34.3 Optimal allocations and results

Asset                                     Lower bound (%)  Upper bound (%)  Naïve (%)  PGSL (%)  Direct (%)
Cash                                      10               10               10.0       10.0      10.0
RBC Hedge 250 Equity Market Neutral       10               16               13.0       15.0      16.0
RBC Hedge 250 Equity Long/Short           14               20               17.0       16.1      20.0
RBC Hedge 250 Fixed Income Arbitrage      7                13               10.0       12.5      13.0
RBC Hedge 250 Macro                       10               20               15.0       12.7      13.9
RBC Hedge 250 Managed Futures             10               20               15.0       12.7      15.4
RBC Hedge 250 Credit                      5                15               10.0       16.7a     11.5
RBC Hedge 250 Mergers and Sp.Situations   0                10               5.0        0.6       0.0
RBC Hedge 250 Multi-Strategy              0                10               5.0        5.8       0.1
Median return                             –                –                7.40       7.80      7.96
Excess Tail Maximum Drawdown              –                –                2.70       3.22      1.71

a In breach of upper allocation constraint

with PGSL only one other allocation is near its lower or upper bound, whereas with MATLAB Direct five allocations are at or near either the lower or upper bound. Thirdly, we compared the results to a portfolio in which the allocation of capital to each asset was set to the midpoint between the lower and upper bounds placed on that asset (the naïve allocation). We noted that both optimisers improved median returns (7.8 and 8.0% vs. 7.40%) and that MATLAB Direct reduced the breach of the maximum drawdown constraint (1.71% vs. 2.70%), whereas the PGSL optimisation failed to improve on this condition (3.22% vs. 2.70%). The MATLAB Direct portfolio had the better maximum drawdown distribution, both in terms of worst case and general performance. The MATLAB Direct optimised portfolio also performed best of the three portfolios in terms of cumulative returns. Finally, we noted that the PGSL optimisation terminated on maximum iterations, and this might explain why it failed to meet all the allocation criteria (Table 34.3).
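The naïve allocation and the bound checks discussed above are easy to reproduce from Table 34.3. The following Python sketch uses the bounds and allocations as printed in that table; the asset names are shortened and the helper names `naive_allocation` and `breaches` are introduced here purely for illustration.

```python
# Bounds (lower %, upper %) and optimised allocations (%) copied from Table 34.3.
BOUNDS = {
    "Cash": (10, 10),
    "Equity Market Neutral": (10, 16),
    "Equity Long/Short": (14, 20),
    "Fixed Income Arbitrage": (7, 13),
    "Macro": (10, 20),
    "Managed Futures": (10, 20),
    "Credit": (5, 15),
    "Mergers and Special Situations": (0, 10),
    "Multi-Strategy": (0, 10),
}
PGSL = dict(zip(BOUNDS, [10.0, 15.0, 16.1, 12.5, 12.7, 12.7, 16.7, 0.6, 5.8]))
DIRECT = dict(zip(BOUNDS, [10.0, 16.0, 20.0, 13.0, 13.9, 15.4, 11.5, 0.0, 0.1]))


def naive_allocation(bounds):
    """Midpoint between the lower and upper bound of every asset."""
    return {name: (lower + upper) / 2 for name, (lower, upper) in bounds.items()}


def breaches(allocation, bounds, tol=1e-9):
    """Names of assets whose allocation lies outside its bounds."""
    return sorted(
        name
        for name, weight in allocation.items()
        if weight < bounds[name][0] - tol or weight > bounds[name][1] + tol
    )


naive = naive_allocation(BOUNDS)
pgsl_breaches = breaches(PGSL, BOUNDS)      # ['Credit']
direct_breaches = breaches(DIRECT, BOUNDS)  # []
```

The midpoints sum to 100%, so the naïve allocation is fully invested by construction, and the only individual bound breached is the Credit allocation under PGSL, as flagged in the table footnote.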

34.5 Conclusion

The review of Global Search Optimisation algorithms showed that there is a range of methods available, but their relative performance is variable. The specifics of the problem and the initial conditions can affect the results significantly. In applying MATLAB Direct and PGSL to the FoHF portfolio optimisation problem, we observed that we improved on the naïve solution in both cases, but each method presented solution characteristics that might be less desirable. PGSL was unable to find a solution that met its threshold stopping criterion, whilst MATLAB Direct found a solution with many corner points. Further research is required to evaluate the stability of the optimiser outputs and their sensitivity to salient optimisation parameters.

References

1. Minsky B, Obradovic M, Tang Q, Thapar R (2008) Global optimisation algorithms for financial portfolio optimisation. Working paper, University of Sussex
2. Raphael B, Smith IFC (2003) A direct stochastic algorithm for global search. Appl Math Comput 146:729–758
3. Huyer W, Neumaier A (1999) Global optimisation by multilevel coordinate search. J Global Optim 14:331–355
4. Embrechts P, Puccetti G (2006) Aggregating risk capital, with an application to operational risk. Geneva Risk Insur 31(2):71–90
5. Kane SJ, Bartholomew-Biggs MC (2009) Optimising omega. J Global Optim 45(1)
6. Genetic Algorithm and Direct Search Toolbox™ 2 user's guide, Mathworks. http://www.mathworks.com/access/helpdesk/help/pdf_doc/gads/gads_tb.pdf
7. User's guide for TOMLAB/LGO, TOMLAB. http://tomopt.com/docs/TOMLAB_LGO.pdf
8. NAG Library Routine Document E05JBF, NAG. http://www.nag.co.uk/numeric/FL/nagdoc_fl22/pdf/E05/e05jbf.pdf

Chapter 35

Increasing the Sensitivity of Variability EWMA Control Charts

Saddam Akber Abbasi and Arden Miller

Abstract The control chart is the most important statistical process control (SPC) tool used to monitor the reliability and performance of manufacturing processes. Variability EWMA charts are widely used for the detection of small shifts in process dispersion. For ease of computation, all the variability EWMA charts proposed so far are based on asymptotic control limits. This study shows that quicker detection of initial out-of-control conditions can be achieved by using exact, or time varying, control limits. Moreover, the effect of the fast initial response (FIR) feature in further increasing the sensitivity of variability EWMA charts to process shifts has not previously been studied in the SPC literature. We observe that the FIR based variability EWMA chart is more sensitive to process shifts than the variability charts based on time varying or asymptotic control limits.

35.1 Introduction

Control charts, introduced by Walter A. Shewhart in the 1920s, are the most important statistical process control (SPC) tool used to monitor the reliability and performance of manufacturing processes. The basic purpose of implementing control chart procedures is to detect abnormal variations in the process parameters

S. A. Abbasi (✉) · A. Miller, Department of Statistics, The University of Auckland, Private Bag 92019, Auckland 1142, New Zealand e-mail: [email protected] A. Miller e-mail: [email protected]

S. I. Ao and L. Gelman (eds.), Electrical Engineering and Applied Computing, Lecture Notes in Electrical Engineering, 90, DOI: 10.1007/978-94-007-1192-1_35, © Springer Science+Business Media B.V. 2011


(location and scale). Although first proposed for the manufacturing industry, control charts have now been applied in a wide variety of disciplines, such as nuclear engineering [1], health care [2], education [3] and analytical laboratories [4, 5]. Shewhart-type control charts are the most widely used: process location is usually monitored by an $\bar{X}$ chart and process dispersion by an R or S chart. Research has shown that, owing to their memoryless nature, Shewhart control charts do not perform well in detecting small and moderate shifts in the process parameters. When quick detection of small shifts is desirable, cumulative sum (CUSUM) and exponentially weighted moving average (EWMA) charts are superior alternatives to Shewhart charts (for details see [6, 7]). Since the introduction of the EWMA chart by [8], many researchers have examined these charts from different perspectives; see for example [5, 9–16] and the references therein. In contrast to Shewhart-type charts, which are based only on the current observations, EWMA charts also make use of historical observations by adopting a varying weight scheme: the highest weight is assigned to the most recent observations and the weights decrease exponentially for less recent observations. This helps in the earlier detection of small shifts in the process (location and scale) parameters (see [6]). Monitoring process variability using EWMA charts has also attracted the attention of many researchers. Some important contributions are [17–22]. Recently [15] proposed a new EWMA chart for monitoring process dispersion, the NEWMA chart, and showed that it outperformed the variability EWMA chart proposed by [19] in terms of average run length. All the variability EWMA schemes proposed so far are based on asymptotic control limits.
Ease of computation has been reported as the main reason for using asymptotic limits, but this makes the EWMA chart insensitive to start-up quality problems. It should be noted that the exact control limits of EWMA charts vary with time and approach the asymptotic limits as time increases (see [6]). When the process is initially out-of-control, it is extremely important to detect the sources of these out-of-control conditions as early as possible so that corrective actions can be taken at an early stage. This can be achieved by using the exact limits instead of the asymptotic control limits. The sensitivity of the time varying EWMA chart can be increased further by narrowing the time varying limits at process start-up or by adding a head start feature. In the SPC framework this feature is well known as fast initial response (FIR) (for details see [6]). The effect of the FIR feature on the sensitivity of variability EWMA charts has not been investigated so far in the SPC literature. This study investigates the performance of variability EWMA charts that use asymptotic, time varying and FIR based control limits. The comparison is made on the basis of run length characteristics: the average run length (ARL), the median run length (MDRL) and the standard deviation of the run length distribution (SDRL). To investigate the effect of time varying control limits and of FIR on variability EWMA chart performance, we use the NEWMA chart, which was recently proposed by [15] in the Journal of Quality Technology. Time varying and FIR based


control limits are constructed for the NEWMA chart and their performance is compared to that of asymptotic control limits. The rest of the study is organized as follows: Sect. 35.2 briefly introduces the structure of the NEWMA chart and presents the design of the NEWMA chart using time varying control limits (the TNEWMA chart). The next section compares the run length characteristics of the NEWMA and TNEWMA charts. The effect of the FIR feature is then investigated and compared to the asymptotic and time varying EWMA schemes. To give better insight into the run length distributions of these charts, run length curves are also presented. The chapter ends with concluding remarks.

35.2 TNEWMA Chart

In this section we briefly describe the structure of the NEWMA chart as proposed by [15] and construct time varying control limits for this chart. Assume the quality variable of interest $X$ follows a normal distribution with mean $\mu_t$ and variance $\sigma_t^2$ (i.e. $X \sim N(\mu_t, \sigma_t^2)$). Let $S_t^2$ represent the sample variance and $\delta_t$ the ratio of the process standard deviation $\sigma_t$ to its true value $\sigma_0$ at time period $t$ (i.e. $\delta_t = \sigma_t/\sigma_0$). Suppose $Y_t = \ln(S_t^2/\sigma_0^2)$; for an in-control process, i.e. $\sigma_t = \sigma_0$, $Y_t$ is approximately normally distributed with mean $\mu_Y$ and variance $\sigma_Y^2$, where

\[ \mu_Y = \ln(\delta_t^2) - \frac{1}{n-1} - \frac{1}{3(n-1)^2} + \frac{2}{15(n-1)^4} \tag{35.1} \]

and

\[ \sigma_Y^2 = \frac{2}{n-1} + \frac{2}{(n-1)^2} + \frac{4}{3(n-1)^3} - \frac{16}{15(n-1)^5}, \tag{35.2} \]

with $n$ denoting the sample size. Note that when the process is in control the statistic

\[ Z_t = \frac{Y_t - \mu_Y \mid_{\sigma_t = \sigma_0}}{\sigma_Y} \tag{35.3} \]

is approximately a standard normal variate. When the process is out of control, $Z_t \sim N(\gamma_t, 1)$, where $\gamma_t = \ln(\sigma_t^2/\sigma_0^2)/\sigma_Y$ [15]. The EWMA statistic for monitoring process variability used by [15] is based on resetting $Z_t$ to zero whenever its value becomes negative, i.e. $Z_t^+ = \max(0, Z_t)$. The NEWMA chart is based on plotting the EWMA statistic

\[ W_t = \lambda \left( Z_t^+ - \frac{1}{\sqrt{2\pi}} \right) + (1-\lambda) W_{t-1}, \tag{35.4} \]

where the smoothing constant $\lambda$ is the weight assigned to the most recent sample observation ($0 < \lambda \le 1$). Small values of $\lambda$ are effective for quick detection of small process shifts. As the value of $\lambda$ increases the NEWMA chart performs better for the detection of large process shifts. An out of control signal occurs whenever $W_t > UCL_a$, where

\[ UCL_a = L_a \sqrt{\frac{\lambda}{2-\lambda}}\, \sigma_{Z_t^+}. \tag{35.5} \]

Ref. [23] showed that

\[ \sigma^2_{Z_t^+} = \frac{1}{2} - \frac{1}{2\pi}. \tag{35.6} \]
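The chain from sample variances to the plotted statistic can be sketched in Python as follows. This is an illustrative sketch only: the function names, the starting value $W_0 = 0$ and the example inputs are assumptions, and the moment formulas are the approximations of Eqs. 35.1 and 35.2 evaluated for an in-control process ($\delta_t = 1$).

```python
import math


def log_variance_moments(n):
    """Approximate in-control mean and s.d. of Y = ln(S^2 / sigma0^2) for sample size n."""
    m = n - 1
    mu_y = -1 / m - 1 / (3 * m**2) + 2 / (15 * m**4)  # ln(delta^2) = 0 in control
    var_y = 2 / m + 2 / m**2 + 4 / (3 * m**3) - 16 / (15 * m**5)
    return mu_y, math.sqrt(var_y)


# Standard deviation of Z+ = max(0, Z) for standard normal Z (Eq. 35.6).
SIGMA_Z_PLUS = math.sqrt(0.5 - 1 / (2 * math.pi))


def newma_path(sample_variances, n, sigma0_sq, lam, w0=0.0):
    """EWMA statistics W_1, W_2, ... for a sequence of sample variances.

    ASSUMPTION: the chart starts at W_0 = w0 = 0.
    """
    mu_y, sd_y = log_variance_moments(n)
    w, path = w0, []
    for s2 in sample_variances:
        z = (math.log(s2 / sigma0_sq) - mu_y) / sd_y                     # Eq. 35.3
        z_plus = max(0.0, z)                                             # reset negatives
        w = lam * (z_plus - 1 / math.sqrt(2 * math.pi)) + (1 - lam) * w  # Eq. 35.4
        path.append(w)
    return path


def ucl_asymptotic(lam, L_a):
    """Asymptotic upper control limit (Eq. 35.5)."""
    return L_a * math.sqrt(lam / (2 - lam)) * SIGMA_Z_PLUS


# Hypothetical sample variances for samples of size 5 with sigma_0^2 = 1.
w_values = newma_path([0.5, 1.4, 2.9], n=5, sigma0_sq=1.0, lam=0.2)
limit = ucl_asymptotic(lam=0.2, L_a=2.271)  # L_a for lambda = 0.20 in Table 35.1
```

A signal would be raised at the first sample for which the corresponding entry of `w_values` exceeds `limit`.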

We will see that the exact variance of $W_t$ is time varying, and hence the exact control limit should depend on time, approaching $UCL_a$ as $t \to \infty$. By defining $Z_t' = Z_t^+ - 1/\sqrt{2\pi}$, we can write $W_t$ as

\[ W_t = \lambda Z_t' + (1-\lambda) W_{t-1}. \tag{35.7} \]

By continuous substitution of $W_{t-i}$, $i = 1, 2, \ldots, t$, the EWMA statistic $W_t$ can be written as (see [6, 8]):

\[ W_t = \lambda \sum_{i=0}^{t-1} (1-\lambda)^i Z'_{t-i} + (1-\lambda)^t W_0. \tag{35.8} \]

Taking the variance of both sides, we obtain

\[ \operatorname{Var}(W_t) = \lambda^2 \sum_{i=0}^{t-1} (1-\lambda)^{2i} \operatorname{Var}(Z'_{t-i}) + (1-\lambda)^{2t} \operatorname{Var}(W_0). \tag{35.9} \]

For independent random observations $Z_t'$, $\operatorname{Var}(Z_t') = \operatorname{Var}(Z'_{t-i}) = \sigma^2_{Z_t^+}$. Taking the starting value $W_0$ to be a constant, so that $\operatorname{Var}(W_0) = 0$, and summing the geometric series, we have

\[ \operatorname{Var}(W_t) = \sigma^2_{Z_t^+} \left( \lambda^2 \left[ \frac{1 - (1-\lambda)^{2t}}{1 - (1-\lambda)^2} \right] \right). \tag{35.10} \]

Since $1 - (1-\lambda)^2 = \lambda(2-\lambda)$, this further simplifies to

\[ \operatorname{Var}(W_t) = \sigma^2_{Z_t^+} \left\{ \frac{\lambda}{2-\lambda} \left[ 1 - (1-\lambda)^{2t} \right] \right\}. \tag{35.11} \]

For the rest of the study we will refer to the variability EWMA chart based on the exact variance of $W_t$ given in Eq. 35.11 as the TNEWMA chart. The TNEWMA chart gives an out of control signal whenever $W_t > UCL_t$, where

\[ UCL_t = L_t \sqrt{\frac{\lambda \left[ 1 - (1-\lambda)^{2t} \right]}{2-\lambda}}\, \sigma_{Z_t^+}. \tag{35.12} \]

$UCL_t$ converges to $UCL_a$ as $t \to \infty$, where the rate of convergence is slower for smaller values of $\lambda$.
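The convergence of the exact limit to the asymptotic one, and its dependence on the smoothing constant, can be checked numerically. The Python sketch below evaluates the two limits for the same multiplier $L$; the function names and the 1% tolerance used in `samples_to_converge` are assumptions chosen for illustration.

```python
import math

# Standard deviation of Z+ = max(0, Z) for standard normal Z.
SIGMA_Z_PLUS = math.sqrt(0.5 - 1 / (2 * math.pi))


def ucl_time_varying(t, lam, L_t):
    """Exact (time varying) upper control limit at sample t."""
    factor = lam * (1 - (1 - lam) ** (2 * t)) / (2 - lam)
    return L_t * math.sqrt(factor) * SIGMA_Z_PLUS


def ucl_asymptotic(lam, L_a):
    """Asymptotic upper control limit, the limit of the above as t grows."""
    return L_a * math.sqrt(lam / (2 - lam)) * SIGMA_Z_PLUS


def samples_to_converge(lam, rel_tol=0.01):
    """Smallest t at which the exact limit is within rel_tol of its asymptote (same L)."""
    t = 1
    while math.sqrt(1 - (1 - lam) ** (2 * t)) < 1 - rel_tol:
        t += 1
    return t


slow = samples_to_converge(0.05)  # small lambda: many samples needed
fast = samples_to_converge(0.5)   # larger lambda: only a few samples needed
```

With $\lambda = 1$ the factor $1 - (1-\lambda)^{2t}$ equals one from the first sample, so the two limits coincide immediately; for small $\lambda$ the exact limit stays noticeably tighter than the asymptotic limit over many early samples, which is where the gain in start-up sensitivity comes from.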

35.3 Comparison of Run Length Characteristics of NEWMA and TNEWMA Charts

To evaluate the performance of control charts, the average run length (ARL) is the most important and widely used measure. The ARL is the mean number of observations until a control chart gives an out of control signal. In this study, a Monte Carlo simulation with 10,000 iterations is used to approximate the run length distributions of the NEWMA and TNEWMA charts, following the methods of [9, 24, 25, 26]. Note that [27, 28] indicate that even 5,000 replications are enough to find ARLs within an acceptable error rate in many control chart settings. To give better insight into the performance of the proposed charts, the median and the standard deviation of the run length distribution are also provided. The run length characteristics of the NEWMA and TNEWMA charts are summarised in Tables 35.1 and 35.2 for different values of the smoothing parameter $\lambda$. In these tables ARL denotes the average run length, SDRL the standard deviation of the run length distribution and MDRL the median of the run length distribution. In each table, the smoothing constant $\lambda$ increases as we move across the columns from left to right, whereas the shift $\delta$ increases as we move down the rows from top to bottom. The rows corresponding to $\delta = 1$ give the run length characteristics of both charts when the process is in statistical control. The process is said to be out-of-control for $\delta > 1.0$. The control limit multipliers $L_a$ and $L_t$ are chosen to give the same in-control average run length of 200 (i.e. $ARL_0 = 200$) for both charts.
The results in Tables 35.1 and 35.2 indicate that for smaller values of $\lambda$ (the most popular choice for EWMA charts), the out-of-control ARL ($ARL_1$) of the TNEWMA chart is significantly lower than that of the NEWMA chart: for example, $ARL_1 = 9.93$ for the TNEWMA chart with $\lambda = 0.05$ and $\delta = 1.2$, while $ARL_1 = 14.52$ for the NEWMA chart for the same values of $\lambda$ and $\delta$. This indicates that the TNEWMA chart requires on average nearly five fewer observations than the NEWMA chart to detect a shift of $1.2\sigma$ in process variability when $\lambda = 0.05$. The MDRL of the TNEWMA chart is also lower than that of the NEWMA chart, while there is a slight increase in the SDRL of the TNEWMA chart compared to the NEWMA chart for lower values of $\lambda$ and $\delta$. Figure 35.1 presents an ARL comparison of the NEWMA and TNEWMA charts for some choices of $\lambda$. In each plot, the size of the multiplicative shift in process variability $\delta$ is plotted on the horizontal axis, while the ARL is plotted on the vertical axis on a logarithmic scale for better visual comparison. The effect of using time varying control limits can be clearly seen from Fig. 35.1, particularly for smaller values of $\lambda$. As expected, the ARL of the TNEWMA chart starts to converge

Table 35.1 Run length characteristics of NEWMA chart when $ARL_0 = 200$
(columns: smoothing constant λ; rows: shift δ; L_a: control limit multiplier)

δ    Stat      0.05    0.10    0.15    0.20    0.25    0.30    0.50    0.70    0.90    1.00
     L_a      1.569   1.943   2.148   2.271   2.362   2.432   2.584   2.650   2.684   2.693
1.0  ARL     199.69  200.74  200.24  199.80  199.23  200.39  199.88  199.82  199.11  199.52
     MDRL    136.00  140.50  139.00  137.00  138.00  142.00  139.00  139.00  141.00  137.00
     SDRL    197.62  198.83  200.98  197.14  197.55  203.09  197.02  196.87  195.46  202.39
1.1  ARL      31.68   35.33   37.69   40.11   41.38   43.26   49.42   54.11   61.16   65.19
     MDRL     24.00   26.00   28.00   29.00   29.00   31.00   35.00   38.00   43.00   46.00
     SDRL     26.26   30.77   34.30   36.79   38.99   41.50   48.52   55.14   59.77   64.62
1.2  ARL      14.52   14.81   15.48   15.97   16.48   17.30   19.66   22.22   25.87   28.44
     MDRL     12.00   12.00   12.00   12.00   12.00   13.00   14.00   16.00   18.00   20.00
     SDRL     10.05   11.06   12.12   13.13   13.93   15.41   18.54   21.10   25.47   27.88
1.3  ARL       9.21    9.06    9.07    9.25    9.34    9.57   10.34   11.73   13.89   14.97
     MDRL      8.00    8.00    7.00    7.00    7.00    7.00    8.00    8.00   10.00   10.00
     SDRL      5.49    5.87    6.43    6.83    7.35    7.55    8.89   11.04   13.51   14.43
1.4  ARL       6.72    6.53    6.45    6.42    6.46    6.47    6.76    7.31    8.46    9.23
     MDRL      6.00    6.00    5.00    5.00    5.00    5.00    5.00    5.00    6.00    7.00
     SDRL      3.67    3.92    4.09    4.26    4.54    4.71    5.63    6.43    7.78    8.85
1.5  ARL       5.30    5.18    4.98    4.87    4.81    4.78    4.86    5.15    5.86    6.38
     MDRL      5.00    5.00    4.00    4.00    4.00    4.00    4.00    4.00    4.00    5.00
     SDRL      2.69    2.83    2.89    2.93    3.12    3.26    3.74    4.32    5.22    5.93
1.6  ARL       4.52    4.27    4.10    3.99    3.93    3.87    3.83    3.95    4.41    4.61
     MDRL      4.00    4.00    4.00    3.00    3.00    3.00    3.00    3.00    3.00    3.00
     SDRL      2.15    2.16    2.25    2.30    2.39    2.42    2.72    3.10    3.79    4.03
1.7  ARL       3.91    3.67    3.50    3.41    3.31    3.27    3.15    3.19    3.43    3.69
     MDRL      4.00    3.00    3.00    3.00    3.00    3.00    3.00    2.00    3.00    3.00
     SDRL      1.74    1.74    1.81    1.83    1.89    1.95    2.15    2.40    2.78    3.16
1.8  ARL       3.47    3.26    3.11    3.01    2.91    2.85    2.68    2.70    2.86    3.01
     MDRL      3.00    3.00    3.00    3.00    3.00    2.00    2.00    2.00    2.00    2.00
     SDRL      1.46    1.52    1.52    1.54    1.60    1.63    1.78    1.90    2.26    2.48
1.9  ARL       3.17    2.96    2.83    2.71    2.62    2.51    2.37    2.37    2.46    2.58
     MDRL      3.00    3.00    3.00    2.00    2.00    2.00    2.00    2.00    2.00    2.00
     SDRL      1.30    1.30    1.34    1.39    1.40    1.38    1.47    1.62    1.84    2.00
2.0  ARL       2.92    2.73    2.59    2.43    2.37    2.29    2.13    2.12    2.16    2.24
     MDRL      3.00    2.00    2.00    2.00    2.00    2.00    2.00    2.00    2.00    2.00
     SDRL      1.13    1.17    1.18    1.18    1.19    1.21    1.29    1.41    1.54    1.69

towards the ARL of the NEWMA chart as $\lambda$ increases. At $\lambda = 1$, $UCL_t = UCL_a$, as the factor $(1 - (1-\lambda)^{2t})$ reduces to 1, and hence the ARL performance of the two charts is similar. Moreover, Fig. 35.2 shows the percentage decrease in $ARL_1$ of the TNEWMA chart compared to the NEWMA chart for certain choices of $\lambda$ and $\delta$. We can see that the difference in $ARL_1$ between the charts is bigger for smaller values of $\lambda$ and higher values of $\delta$. The difference tends to reduce as $\lambda$ increases and $\delta$ decreases. Hence the use of exact control limits also improves variability EWMA chart performance for detecting shifts of higher magnitude.
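A Monte Carlo estimate of run length characteristics can be sketched in Python as below. This is a simplified illustration and not the simulation used for the tables: it draws the standardized statistic directly from the normal approximation (mean zero in control, mean $\ln(\delta^2)/\sigma_Y$ after a shift) instead of simulating sample variances, it assumes $W_0 = 0$ and a hypothetical sample size of 5, and all function names are invented for the example. Its output should therefore not be expected to reproduce the tabulated values.

```python
import math
import random
import statistics

SIGMA_Z_PLUS = math.sqrt(0.5 - 1 / (2 * math.pi))  # s.d. of max(0, Z)
MEAN_Z_PLUS = 1 / math.sqrt(2 * math.pi)           # mean of max(0, Z)


def shift_gamma(delta, n):
    """Mean of Z_t after the process s.d. shifts to delta * sigma_0 (sample size n)."""
    m = n - 1
    var_y = 2 / m + 2 / m**2 + 4 / (3 * m**3) - 16 / (15 * m**5)
    return math.log(delta**2) / math.sqrt(var_y)


def run_length(lam, L, gamma=0.0, time_varying=False, rng=random, max_t=100_000):
    """Number of samples until W_t first exceeds its upper control limit."""
    w = 0.0  # ASSUMPTION: the chart starts at W_0 = 0
    for t in range(1, max_t + 1):
        z_plus = max(0.0, rng.gauss(gamma, 1.0))
        w = lam * (z_plus - MEAN_Z_PLUS) + (1 - lam) * w
        scale = 1 - (1 - lam) ** (2 * t) if time_varying else 1.0
        if w > L * math.sqrt(lam * scale / (2 - lam)) * SIGMA_Z_PLUS:
            return t
    return max_t  # safety cap; not expected to be reached in practice


def run_length_summary(lam, L, gamma, time_varying, reps=2000, seed=0):
    """ARL, MDRL and SDRL over `reps` replications, one seeded stream per replicate."""
    lengths = [
        run_length(lam, L, gamma, time_varying, rng=random.Random(seed + i))
        for i in range(reps)
    ]
    return {
        "ARL": statistics.mean(lengths),
        "MDRL": statistics.median(lengths),
        "SDRL": statistics.stdev(lengths),
    }


# Hypothetical run: lambda = 0.05, a shift to delta = 1.5, samples of size 5.
summary = run_length_summary(0.05, 1.569, shift_gamma(1.5, 5), time_varying=False, reps=200)
```

Seeding each replicate separately means the asymptotic and time varying charts can be compared on identical random streams; with the same multiplier the time varying chart can then never signal later than the asymptotic one, because its limit is never wider.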

Table 35.2 Run length characteristics of TNEWMA chart when $ARL_0 = 200$
(columns: smoothing constant λ; rows: shift δ; L_t: control limit multiplier)

δ    Stat      0.05    0.10    0.15    0.20    0.25    0.30    0.50    0.70    0.90    1.00
     L_t      1.649   1.975   2.164   2.279   2.379   2.440   2.588   2.652   2.685   2.693
1.0  ARL     199.85  200.82  200.30  200.19  200.29  200.37  200.13  200.63  199.76  199.73
     MDRL    124.00  134.00  136.50  134.00  138.00  137.00  139.00  141.00  140.00  140.00
     SDRL    209.87  206.10  212.73  199.28  208.94  202.42  200.30  199.74  200.94  194.83
1.1  ARL      25.80   31.45   34.71   37.99   40.25   42.42   49.21   54.01   61.15   65.18
     MDRL     16.00   22.00   25.00   26.00   28.00   30.00   34.00   38.00   43.00   45.00
     SDRL     28.83   32.55   34.31   37.60   40.52   41.58   49.17   53.70   60.63   63.66
1.2  ARL       9.93   11.91   13.18   14.08   15.52   16.65   19.29   21.96   25.67   28.43
     MDRL      6.00    8.00   10.00   10.00   11.00   12.00   14.00   15.00   18.00   20.00
     SDRL     10.36   11.59   12.53   13.07   14.56   15.26   18.48   21.30   25.29   28.12
1.3  ARL       5.61    6.64    7.37    7.81    8.33    8.66    9.98   11.39   13.66   14.94
     MDRL      4.00    5.00    6.00    6.00    6.00    6.00    7.00    8.00   10.00   11.00
     SDRL      5.64    6.09    6.54    6.88    7.30    7.73    9.41   10.90   13.12   14.40
1.4  ARL       3.78    4.56    4.92    5.19    5.46    5.67    6.48    7.09    8.31    9.23
     MDRL      3.00    3.00    4.00    4.00    4.00    4.00    5.00    5.00    6.00    7.00
     SDRL      3.49    3.97    4.21    4.33    4.60    4.69    5.67    6.26    8.05    8.67
1.5  ARL       2.84    3.33    3.68    3.86    4.03    4.14    4.61    5.01    5.78    6.38
     MDRL      2.00    2.00    3.00    3.00    3.00    3.00    4.00    4.00    4.00    5.00
     SDRL      2.49    2.78    3.01    3.10    3.26    3.28    3.80    4.40    5.24    5.82
1.6  ARL       2.32    2.70    2.94    3.04    3.17    3.27    3.60    3.82    4.30    4.58
     MDRL      2.00    2.00    2.00    2.00    3.00    3.00    3.00    3.00    3.00    3.00
     SDRL      1.90    2.16    2.25    2.30    2.37    2.48    2.82    3.16    3.69    4.17
1.7  ARL       2.00    2.29    2.43    2.55    2.66    2.68    2.92    3.07    3.38    3.65
     MDRL      1.00    2.00    2.00    2.00    2.00    2.00    2.00    2.00    3.00    3.00
     SDRL      1.50    1.70    1.81    1.85    1.95    1.95    2.16    2.41    2.77    3.08
1.8  ARL       1.77    2.00    2.14    2.23    2.28    2.26    2.47    2.59    2.81    3.01
     MDRL      1.00    1.00    2.00    2.00    2.00    2.00    2.00    2.00    2.00    2.00
     SDRL      1.23    1.42    1.53    1.58    1.60    1.64    1.78    1.92    2.24    2.47
1.9  ARL       1.60    1.78    1.91    1.97    2.02    2.03    2.18    2.27    2.39    2.56
     MDRL      1.00    1.00    1.00    2.00    2.00    2.00    2.00    2.00    2.00    2.00
     SDRL      1.04    1.17    1.28    1.30    1.34    1.35    1.47    1.60    1.90    2.06
2.0  ARL       1.47    1.64    1.72    1.76    1.82    1.84    1.95    2.02    2.09    2.24
     MDRL      1.00    1.00    1.00    1.00    1.00    1.00    2.00    2.00    2.00    2.00
     SDRL      0.91    1.03    1.09    1.14    1.17    1.17    1.26    1.33    1.52    1.64

35.4 Effect of Fast Initial Response on Variability EWMA Chart

We have seen in the previous section that the use of time varying control limits as compared to asymptotic limits significantly improves the out-of-control run length behavior of variability EWMA charts. A further increase in the sensitivity of EWMA chart to detect shifts in variability can be achieved by using an FIR feature. The FIR feature, introduced by [29] for CUSUM charts, detects

Fig. 35.1 ARL comparison of NEWMA and TNEWMA charts for different values of $\lambda$ when $ARL_0 = 200$ (six panels: $\lambda$ = 0.05, 0.15, 0.25, 0.50, 0.70 and 1.00; horizontal axis: $\delta$ from 1.0 to 2.0; vertical axis: Log (ARL) from 0.0 to 2.5; curves: NEWMA, TNEWMA)

Fig. 35.2 Percentage decrease in out-of-control ARL of TNEWMA chart as compared to NEWMA chart when $ARL_0 = 200$ (horizontal axis: $\lambda$ from 0.1 to 1; vertical axis: percentage decrease in ARL from 0 to 50; curves: $\delta$ = 1.2, 1.4, 1.6, 1.8 and 2.0)

out-of-control signals more quickly at process startup by assigning some nonzero constant to the starting values of CUSUM chart statistics. Lucas and Saccucci [9] proposed the idea of applying the FIR feature to EWMA control structures by using two one-sided EWMA charts. Rhoads et al. [30] used the


FIR approach for time varying control limits and showed superior performance of their proposed scheme compared to the [9] FIR scheme. Both these schemes were criticized as they require the use of two EWMA charts instead of one for monitoring changes in process parameters. Steiner [11] presented another FIR scheme for EWMA charts. His proposal is based on further narrowing the time varying control limits by using an exponentially decreasing FIR adjustment which is defined as FIRadj ¼ 1 ð1 f Þ1þaðt1Þ ;

ð35:13Þ

where $a$ is the adjustment parameter, chosen so that the FIR adjustment has very little effect after a specified time period; for example, at $t = 20$ we have $\mathrm{FIR}_{adj} = 0.99$. The effect of this FIR adjustment decreases with time, and at the first observation it makes the control limit a proportion $f$ of the distance from the starting value [11]. By comparing run length characteristics, Steiner [11] showed that his FIR scheme outperformed the earlier FIR schemes of [9, 30]. The FIR adjustment used by [11] is attractive and has recently been applied by [31] to generally weighted moving average control charts.

In this section we examine the effect of FIR on the performance of the variability EWMA chart. The time-varying variability EWMA chart using FIR is referred to as the FNEWMA chart for the rest of this study. The FNEWMA chart signals an out-of-control condition whenever $W_t$ exceeds $UCL_f$, where $UCL_f$ is given as

\[ UCL_f = L_f \left[ 1 - (1 - f)^{1 + a(t - 1)} \right] \sigma_{Z_t^+} \sqrt{\frac{\lambda \left[ 1 - (1 - \lambda)^{2t} \right]}{2 - \lambda}}. \tag{35.14} \]

To obtain a substantial benefit from the FIR feature, $f$ should be fairly small. In this study we used $f = 0.5$ and limited the effect of the FIR adjustment to $t = 20$, following [11, 31]. The run length characteristics of the FNEWMA chart are reported in Table 35.3. $ARL_0$ for the FNEWMA chart is also fixed at 200 by using appropriate $L_f$ values for the different choices of $\lambda$. Comparing the results in Tables 35.1, 35.2 and 35.3 shows the superior run length performance of the FNEWMA chart relative to the NEWMA and TNEWMA charts. For example, the FNEWMA chart has $ARL_1 = 10.72$ for $\lambda = 0.3$ and $\delta = 1.2$, while the corresponding $ARL_1$ values for the TNEWMA and NEWMA charts are 16.65 and 17.30, respectively. This indicates that, on average, the FNEWMA chart requires about six fewer observations than the NEWMA and TNEWMA charts to detect a shift of $1.2\sigma$ in process variability when $\lambda = 0.3$. Figure 35.3 compares the ARLs of the NEWMA, TNEWMA and FNEWMA charts for some choices of $\lambda$. The $ARL_1$ of the FNEWMA chart is consistently lower than the $ARL_1$ of both the NEWMA and TNEWMA charts for every choice of $\lambda$. This indicates that the FNEWMA chart detects shifts in process variability more quickly than the other two charts. The difference appears greater for higher values of $\lambda$, which is consistent with the findings of [11].
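The FIR adjustment and the resulting control limit can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function names are hypothetical, the adjustment parameter $a$ is obtained by solving Eq. 35.13 for $\mathrm{FIR}_{adj} = 0.99$ at $t = 20$, and $\sigma_{Z_t^+}$ is set to a placeholder value of 1.0 because its numerical value is not given in this section. The values $\lambda = 0.30$ and $L_f = 2.530$ are taken from Table 35.3.

```python
import math


def fir_adjustment_parameter(f: float, t_end: int, target: float = 0.99) -> float:
    """Solve 1 - (1 - f)**(1 + a*(t_end - 1)) = target for a."""
    exponent = math.log(1.0 - target) / math.log(1.0 - f)
    return (exponent - 1.0) / (t_end - 1)


def fir_adj(t: int, f: float, a: float) -> float:
    """FIR adjustment of Eq. 35.13 at time t = 1, 2, ..."""
    return 1.0 - (1.0 - f) ** (1.0 + a * (t - 1))


def ucl_f(t: int, lam: float, L_f: float, sigma_z: float, f: float, a: float) -> float:
    """Time-varying FIR control limit of Eq. 35.14."""
    width = math.sqrt(lam * (1.0 - (1.0 - lam) ** (2 * t)) / (2.0 - lam))
    return L_f * fir_adj(t, f, a) * sigma_z * width


f = 0.5
a = fir_adjustment_parameter(f, t_end=20)
# sigma_z = 1.0 is a placeholder (assumption), not a value from the chapter.
limits = [ucl_f(t, lam=0.30, L_f=2.530, sigma_z=1.0, f=f, a=a) for t in (1, 5, 20, 200)]
print(round(a, 3), [round(u, 3) for u in limits])
```

At $t = 1$ the adjustment equals $f$, so the limit starts at half its unadjusted width when $f = 0.5$, and it approaches the asymptotic limit $L_f \sigma_{Z_t^+} \sqrt{\lambda / (2 - \lambda)}$ as $t$ grows.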


Table 35.3 Run length characteristics of the FNEWMA chart when ARL0 = 200

             λ      0.05    0.10    0.15    0.20    0.25    0.30    0.50    0.70    0.90    1.00
δ            Lf    1.740   2.071   2.241   2.369   2.460   2.530   2.670   2.736   2.770   2.784
1.0   ARL         199.24  200.49  199.63  199.99  200.21  200.74  199.96  200.29  199.55  199.61
      MDRL         94.00  115.50  117.00  121.00  122.00  122.00  117.50  114.00  109.00  109.00
      SDRL        263.25  249.11  244.59  239.14  240.99  242.92  241.41  249.86  253.24  251.52
1.1   ARL          21.17   25.84   28.27   31.12   31.40   34.36   38.57   41.54   45.85   50.85
      MDRL          7.00   12.00   14.00   15.00   14.00   15.00   16.00   14.50   14.00   16.00
      SDRL         29.33   34.37   36.66   40.77   41.93   46.10   52.51   58.65   68.32   74.19
1.2   ARL           8.02    8.85    9.16    9.65   10.29   10.72   11.91   13.07   14.25   15.19
      MDRL          4.00    4.00    4.00    4.00    4.00    5.00    5.00    4.00    4.00    4.00
      SDRL         10.14   11.57   12.57   12.72   13.77   14.46   17.20   20.49   22.95   27.47
1.3   ARL           4.08    4.65    4.77    5.06    5.24    5.29    5.61    5.88    6.53    7.07
      MDRL          2.00    2.00    2.00    2.00    3.00    3.00    3.00    3.00    3.00    3.00
      SDRL          5.15    5.75    5.83    6.16    6.60    6.68    7.53    8.35   10.09   11.57
1.4   ARL           2.76    3.12    3.28    3.28    3.29    3.39    3.45    3.56    3.78    3.96
      MDRL          1.00    2.00    2.00    2.00    2.00    2.00    2.00    2.00    2.00    2.00
      SDRL          3.18    3.48    3.60    3.65    3.62    3.84    4.07    4.31    5.23    5.63
1.5   ARL           2.11    2.33    2.42    2.48    2.50    2.54    2.58    2.51    2.65    2.80
      MDRL          1.00    1.00    1.00    1.00    1.00    2.00    2.00    2.00    2.00    2.00
      SDRL          2.07    2.30    2.38    2.49    2.42    2.51    2.60    2.56    2.95    3.34
1.6   ARL           1.80    1.94    1.99    2.02    2.06    2.06    2.09    2.12    2.13    2.19
      MDRL          1.00    1.00    1.00    1.00    1.00    1.00    1.00    1.00    1.00    1.00
      SDRL          1.57    1.73    1.74    1.79    1.84    1.83    1.82    1.90    1.93    2.16
1.7   ARL           1.58    1.68    1.70    1.75    1.76    1.80    1.77    1.79    1.83    1.87
      MDRL          1.00    1.00    1.00    1.00    1.00    1.00    1.00    1.00    1.00    1.00
      SDRL          1.20    1.30    1.32    1.38    1.38    1.44    1.40    1.38    1.48    1.57
1.8   ARL           1.44    1.52    1.55    1.57    1.59    1.58    1.61    1.61    1.62    1.61
      MDRL          1.00    1.00    1.00    1.00    1.00    1.00    1.00    1.00    1.00    1.00
      SDRL          0.96    1.07    1.10    1.12    1.13    1.11    1.12    1.12    1.18    1.16
1.9   ARL           1.34    1.40    1.43    1.44    1.47    1.46    1.46    1.48    1.49    1.50
      MDRL          1.00    1.00    1.00    1.00    1.00    1.00    1.00    1.00    1.00    1.00
      SDRL          0.79    0.89    0.91    0.91    0.95    0.93    0.90    0.93    0.93    0.98
2.0   ARL           1.26    1.32    1.34    1.36    1.36    1.38    1.40    1.39    1.40    1.41
      MDRL          1.00    1.00    1.00    1.00    1.00    1.00    1.00    1.00    1.00    1.00
      SDRL          0.66    0.74    0.79    0.78    0.77    0.81    0.81    0.80    0.83    0.84

To get more insight into the run length distributions of the NEWMA, TNEWMA and FNEWMA charts, Fig. 35.4 presents run length curves (RLCs) of these charts for certain values of $\lambda$ using $\delta = 1.2$. For smaller values of $\lambda$, the RLCs of the TNEWMA chart lie above the RLCs of the NEWMA chart, indicating that the TNEWMA chart has a greater probability of shorter run lengths for these $\lambda$ values. The superiority of the FNEWMA chart over the NEWMA and TNEWMA charts is also clear for all values of $\lambda$. Note that a high probability at shorter run lengths indicates that shifts in the process variability will be detected quickly with high probability.
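The run length summaries in Table 35.3 and the run length curves of Fig. 35.4 are all functions of a sample of simulated run lengths. The sketch below shows one way to compute them; the function names and the small sample are hypothetical, and the use of the sample (n - 1) standard deviation for the SDRL is an assumption, since the chapter does not say which estimator was used.

```python
import statistics


def run_length_summary(run_lengths):
    """Return (ARL, MDRL, SDRL) for a sample of run lengths."""
    arl = statistics.mean(run_lengths)
    mdrl = statistics.median(run_lengths)
    sdrl = statistics.stdev(run_lengths)  # sample SD (assumption)
    return arl, mdrl, sdrl


def run_length_curve(run_lengths, max_t):
    """Empirical cumulative probability P(RL <= t) for t = 1..max_t."""
    n = len(run_lengths)
    return [sum(rl <= t for rl in run_lengths) / n for t in range(1, max_t + 1)]


# Hypothetical run lengths, for illustration only.
sample = [1, 2, 2, 3, 4, 4, 5, 8, 13, 28]
arl, mdrl, sdrl = run_length_summary(sample)
curve = run_length_curve(sample, max_t=10)
print(arl, mdrl, round(sdrl, 2), curve)
```

A chart whose curve lies above another's at a given run length has a higher probability of having signalled by that point, which is how the RLCs are read in the text.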

Fig. 35.3 ARL comparison of NEWMA, TNEWMA and FNEWMA charts for different values of λ when ARL0 = 200 (panels: λ = 0.05, 0.15, 0.25, 0.50, 0.70 and 1.00; horizontal axis: δ; vertical axis: Log(ARL))

35.5 Conclusions

This chapter examines the performance of the variability EWMA chart using asymptotic, time-varying and FIR-based control limits. It has been shown that the ability of the variability EWMA chart to detect shifts in variation can be improved by using exact (time-varying) control limits instead of asymptotic ones, particularly for smaller values of the smoothing parameter $\lambda$. The FIR feature has also been shown to contribute significantly to further increasing the sensitivity of the EWMA chart to shifts in process variability. Computations have been performed using the NEWMA chart, but these results can be generalized to the other variability EWMA charts discussed in Sect. 35.1. This study will help quality practitioners to choose a more sensitive variability EWMA chart.

Fig. 35.4 Run length curves of NEWMA, TNEWMA and FNEWMA charts for different values of λ when δ = 1.2 and ARL0 = 200 (panels: λ = 0.05, 0.20, 0.50 and 0.90; horizontal axis: Run Length; vertical axis: Cumulative Probability)

References

1. Hwang SL, Lin JT, Liang GF, Yau YJ, Yenn TC, Hsu CC (2008) Application control chart concepts of designing a pre-alarm system in the nuclear power plant control room. Nucl Eng Design 238(12):3522–3527
2. Woodall WH (2006) The use of control charts in health-care and public-health surveillance. J Qual Technol 38(2):89–104
3. Wang Z, Liang R (2008) Discuss on applying SPC to quality management in university education. In: Proceedings of the 9th international conference for young computer scientists, ICYCS 2008, pp 2372–2375
4. Masson P (2007) Quality control techniques for routine analysis with liquid chromatography in laboratories. J Chromatogr A 1158(1–2):168–173
5. Abbasi SA (2010) On the performance of EWMA chart in presence of two component measurement error. Qual Eng 22(3):199–213
6. Montgomery DC (2001) Introduction to statistical quality control, 4th edn. Wiley, New York
7. Ryan PR (2000) Statistical methods for quality improvement, 2nd edn. Wiley, New York
8. Roberts SW (1959) Control chart tests based on geometric moving averages. Technometrics 1(3):239–250
9. Lucas JM, Saccucci MS (1990) Exponentially weighted moving average control schemes. Properties and enhancements. Technometrics 32(1):1–12
10. Montgomery DC, Torng JCC, Cochran JK, Lawrence FP (1995) Statistically constrained economic design of the EWMA control chart. J Qual Technol 27(3):250–256
11. Steiner SH (1999) EWMA control charts with time-varying control limits and fast initial response. J Qual Technol 31(1):75–86
12. Chan LK, Zhang J (2000) Some issues in the design of EWMA charts. Commun Stat Part B Simul Comput 29(1):207–217
13. Maravelakis PE, Panaretos J, Psarakis S (2004) EWMA chart and measurement error. J Appl Stat 31(4):445–455
14. Carson PK, Yeh AB (2008) Exponentially weighted moving average (EWMA) control charts for monitoring an analytical process. Ind Eng Chem Res 47(2):405–411
15. Shu L, Jiang W (2008) A new EWMA chart for monitoring process dispersion. J Qual Technol 40(3):319–331
16. Abbasi SA (2010) On sensitivity of EWMA control chart for monitoring process dispersion. In: Lecture notes in engineering and computer science: proceedings of the World Congress on engineering 2010, vol III, WCE 2010, 30 June–2 July, 2010, London, UK, pp 2027–2032
17. Wortham AW, Ringer LJ (1971) Control via exponential smoothing. Transportation Logistic Rev 7:33–39
18. Domangue R, Patch SC (1991) Some omnibus exponentially weighted moving average statistical process monitoring schemes. Technometrics 33:299–313
19. Crowder SV, Hamilton M (1992) Average run lengths of EWMA controls for monitoring a process standard deviation. J Qual Technol 24:44–50
20. MacGregor JF, Harris TJ (1993) The exponentially weighted moving variance. J Qual Technol 25:106–118
21. Stoumbos ZG, Reynolds MR Jr (2000) Robustness to non normality and autocorrelation of individual control charts. J Stat Comput Simul 66:145–187
22. Chen GM, Cheng SW, Xie HS (2001) Monitoring process mean and variability with one EWMA chart. J Qual Technol 33:223–233
23. Barr DR, Sherrill ET (1999) Mean and variance of truncated normal distributions. Am Stat 53:357–361
24. Maravelakis P, Panaretos J, Psarakis S (2005) An examination of the robustness to nonnormality of the EWMA control charts for the dispersion. Commun Stat Simul Comput 34(4):1069–1079
25. Neubauer AS (1997) The EWMA control chart: properties and comparison with other quality-control procedures by computer simulation. Clin Chem 43(4):594–601
26. Zhang L, Chen G (2004) EWMA charts for monitoring the mean of censored Weibull lifetimes. J Qual Technol 36(3):321–328
27. Kim MJ (2005) Number of replications required in control chart Monte Carlo simulation studies. PhD Dissertation, University of Northern Colorado
28. Schaffer JR, Kim MJ (2007) Number of replications required in control chart Monte Carlo simulation studies. Commun Stat Simul Comput 36(5):1075–1087
29. Lucas JM, Crosier RB (1982) Fast initial response for CUSUM quality control schemes: give your CUSUM a head start. Technometrics 24(3):199–205
30. Rhoads TR, Montgomery DC, Mastrangelo CM (1996) A fast initial response scheme for the exponentially weighted moving average control chart. Qual Eng 9(2):317–327
31. Chiu WC (2009) Generally weighted moving average control charts with fast initial response features. J Appl Stat 36(3):255–275

Chapter 36

Assessing Response's Bias, Quality of Predictions, and Robustness in Multiresponse Problems

Nuno Costa, Zulema Lopes Pereira and Martín Tanco

Abstract Optimization measures for evaluating compromise solutions in multiresponse problems formulated in the Response Surface Methodology framework are proposed. The measures take into account the desired properties of responses at optimal variable settings, namely the bias, quality of predictions and robustness. They allow the analyst to achieve compromise solutions that are of interest and feasible in practice, particularly when the method used does not consider the responses' variance level and correlation information in the objective function. Two examples from the literature show the utility of the proposed measures.

N. Costa (✉)
Setúbal Polytechnic Institute, College of Technology, Campus do IPS, Estefanilha, 2910-761 Setúbal, Portugal

N. Costa, Z. L. Pereira
UNIDEMI/DEMI, Faculdade de Ciências e Tecnologia, Universidade Nova de Lisboa, 2829-516 Caparica, Portugal

M. Tanco
CITEM, Universidad de Montevideo, Luis P. Ponce, 1307, 11300 Montevideo, Uruguay

S. I. Ao and L. Gelman (eds.), Electrical Engineering and Applied Computing, Lecture Notes in Electrical Engineering, 90, DOI: 10.1007/978-94-007-1192-1_36, © Springer Science+Business Media B.V. 2011

36.1 Introduction

Statistical tools and methodologies like the response surface methodology (RSM) have been increasingly used in industry and have become a change agent in the way design and process engineers think and work [9]. In particular, RSM has been used for developing more robust systems (process and product), and for improving and optimizing system performance with the required efficiency and effectiveness. Readers are referred to Myers et al. [18] for a thorough discussion of this methodology.

While most case studies reported in the literature focus on the optimization of a single quality characteristic of a process or product, the variety of real-life problems requires the consideration of multiple quality characteristics (objectives; responses). This fact, together with researchers' desire to propose enhanced techniques using recent advancements in mathematical optimization, scientific computing and computer technology, has made multiresponse optimization an active research field. New algorithms and methodologies have been developed, and their diffusion into various disciplines has proceeded at a rapid pace. To date, researchers are paying great attention to hybrid approaches that avoid premature algorithm convergence toward a local maximum or minimum and reach the global optimum in problems with multiple responses [24]. Readers are referred to Younis and Dong [25] for a review of the historical development, special features and trends in the development of global optimization algorithms. These authors also examine and compare a number of representative, recently introduced global optimization techniques.

The issue is that the level of computational and mathematical or statistical expertise required to use those algorithms or methodologies and solve such problems successfully is significant. This makes such sophisticated tools hard to adopt, in particular by practitioners [1]. A strategy widely used for optimizing multiple responses in the RSM framework consists of converting the multiple responses into a single (composite) function and then optimizing it, using either the generalized reduced gradient algorithm or the sequential quadratic programming algorithm, available in the popular Microsoft Excel® (Solver add-in) and Matlab® (fmincon routine), respectively.
To form that composite function, the desirability function-based and loss function-based methods are the most popular among practitioners. The existing methods use distinct composite functions to indicate how close the response values are to their targets, but the widely used desirability-based methods do not consider the responses' variance level and correlation information, and the composite function gives the analyst no information about them. What the analyst knows is that either a higher or a lower value is preferred, depending on how the composite function is defined. The composite functions of the loss function-based methods present the result in monetary terms, that is, the compromise solution is expressed by a monetary loss that must be as low as possible, and some of those composite functions consider the variance–covariance structure of the responses. However, the composite functions of loss- and desirability-based methods have a serious drawback: they may give inconsistent results, namely different results for the same response values (compromise solution). This may confuse the analyst and complicate the evaluation of compromise solutions, because he/she needs to check the values of each response considered in the study to identify a compromise solution of interest. The difficulty grows with the number of available solutions and responses. So, the authors propose optimization performance measures with a threefold purpose:


I. Provide relevant information to the analyst so that he/she may achieve compromise solutions that are of interest and feasible in practice whenever the method used does not consider the variance–covariance structure of the responses in the objective function.
II. Help the analyst in evaluating the feasibility of compromise solutions by assessing the response's bias (deviation of responses from their targets), quality of predictions (variance due to uncertainty in the regression coefficients of predicted responses) and robustness (variance due to uncontrollable variables) separately.
III. Allow the evaluation of solutions from methods that cannot be compared directly because of the different approaches underlying those methods, for example, the loss function and desirability function approaches.

The feasibility of the proposed measures is illustrated through desirability and loss function-based methods by using two examples from the literature. The remaining sections are structured as follows: the next section provides a review of analysis methods. Then the optimization measures are introduced. The subsequent sections include the examples and the discussion of results, respectively. Conclusions and future work are presented in the last section.

36.2 Methods for Multiresponse Analysis

The desirability function-based and loss function-based methods are the most popular among practitioners who, in the RSM framework, look for optimum variable settings for the process and product whenever multiple responses are considered simultaneously. Therefore, these widely used methods, which will serve to illustrate the feasibility of the optimization measures, are reviewed below. Many alternative approaches are available in the literature; reviews are provided by Gauri and Pal [8], Kazemzadeh et al. [10] and Murphy et al. [16].

36.2.1 Desirability-Based Methods

The desirability-based methods are easy to understand and flexible for incorporating the decision-maker's preferences (priority to responses), and the most popular of them, the so-called Derringer and Suich's method [6], or modifications of it [5], is available in many data analysis software packages. However, to use this method the analyst needs to assign values to four shape parameters (weights). This is not a simple task, and the values chosen affect the optimal variable settings. An alternative desirability-based method that, under the assumptions of normality and homogeneity of error variances, requires minimum information from the user was proposed by Ch'ng et al. [2]. The method they proposed is easy to understand and to implement in the readily available Microsoft Excel® Solver tool and, in addition, requires less cognitive effort from the analyst. The user only has to assign values to one type of shape parameter (weights), which is a relevant advantage over the extensively used Derringer and Suich's method. Ch'ng et al. [2] suggested individual desirability functions of the form

\[ d = \frac{2\hat{y} - (U + L)}{U - L} + 1 = \frac{2\hat{y}}{U - L} + \frac{-2L}{U - L} = m\hat{y} + c \tag{36.1} \]

where $0 \le d \le 2$ and $\hat{y}$ represents the response's model, with upper and lower bounds defined by $U$ and $L$, respectively. The global desirability (composite) function is defined as

\[ D = \left( \sum_{i=1}^{p} e_i \left| d_i - d_i(\theta_i) \right| \right) \Big/ p \tag{36.2} \]

where $d_i(\theta_i)$ is the value of the individual desirability function $i$ at the target value $\theta_i$, $e_i$ is the weight (degree of importance or priority) assigned to response $i$, $p$ is the number of responses, and $\sum_{i=1}^{p} e_i = 1$. The aim is to minimize $D$.

Although Ch'ng et al. illustrate their method only for the nominal-the-best response type (NTB: the value of the estimated response is expected to achieve a particular target value), in this article the larger-the-best (LTB: the value of the estimated response is expected to be larger than a lower bound) and smaller-the-best (STB: the value of the estimated response is expected to be smaller than an upper bound) response types are also considered. In these cases, $d_i(U_i)$ and $d_i(L_i)$, respectively, are used in Eq. 36.2 instead of $d_i(\theta_i)$, under the assumption that it is possible to establish the specification limits $U$ and $L$ for those responses based on product knowledge or practical experience. Using the maximum or minimum value of the response model is also an alternative. A limitation of this method is that it does not consider the quality of predictions and robustness in the optimization process.
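Equations 36.1 and 36.2 can be sketched directly in code. The snippet below is a minimal illustration, not Ch'ng et al.'s implementation: the function names, bounds, targets and weights are hypothetical, chosen only to show that the individual desirability maps $L$ to 0 and $U$ to 2 and that $D$ is a weighted mean of the absolute gaps to the target desirabilities.

```python
def individual_desirability(y_hat, lower, upper):
    """Eq. 36.1: linear map with d(L) = 0 and d(U) = 2."""
    return 2.0 * (y_hat - lower) / (upper - lower)


def global_desirability(y_hats, targets, bounds, weights):
    """Eq. 36.2: weighted absolute desirability gaps, divided by p (to be minimized)."""
    p = len(y_hats)
    total = 0.0
    for y, theta, (lo, up), e in zip(y_hats, targets, bounds, weights):
        gap = individual_desirability(y, lo, up) - individual_desirability(theta, lo, up)
        total += e * abs(gap)
    return total / p


# Hypothetical problem: response 1 is LTB (target at U), response 2 is NTB.
bounds = [(0.0, 10.0), (20.0, 40.0)]
targets = [10.0, 30.0]
weights = [0.4, 0.6]  # must sum to 1
D = global_desirability([8.0, 33.0], targets, bounds, weights)
print(round(D, 4))
```

For an LTB response the target passed in is the upper bound, so the reference desirability is $d_i(U_i) = 2$; for an STB response it would be the lower bound, giving $d_i(L_i) = 0$.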

36.2.2 Loss Function-Based Methods

The loss function approach takes a different view of multiresponse optimization by considering monetary aspects in the optimization process. This approach is very popular in the industrial engineering community and, unlike the above-mentioned desirability-based methods, there are loss function-based methods that consider the responses' variance level and exploit the responses' correlation information, which is statistically sound. Examples of those methods were introduced by Vining [21] and Lee and Kim [13].

Vining [21] proposed a loss function-based method that allows specifying the directions of economic importance for the compromise optimum, while seriously considering the variance–covariance structure of the expected responses. This method aims at finding the variable settings that minimize an expected loss function defined as

\[ E\left[ L(\hat{y}(x), \theta) \right] = \left( E[\hat{y}(x)] - \theta \right)^T C \left( E[\hat{y}(x)] - \theta \right) + \mathrm{trace}\left[ C\, \Sigma_{\hat{y}}(x) \right] \tag{36.3} \]

where $\Sigma_{\hat{y}}(x)$ is the variance–covariance matrix of the predicted responses at $x$ and $C$ is a cost matrix related to the costs of non-optimal design. If $C$ is a diagonal matrix, then each element represents the relative importance assigned to the corresponding response, that is, the penalty (cost) incurred for each unit of response value deviated from its optimum. If $C$ is a non-diagonal matrix, the off-diagonal elements represent additional costs incurred when pairs of responses are simultaneously off-target. The first term in Eq. 36.3 represents the penalty due to the deviation from the target; the second term represents the penalty due to the quality of predictions.

Lee and Kim [13], like Pignatiello [19] and Wu and Chyu [22], emphasize bias reduction and robustness improvement. They proposed minimizing an expected loss defined as

\[ E\left[ L(y(x), \theta) \right] = \sum_{i=1}^{p} c_i \left[ (\hat{y}_i - \theta_i)^2 + \hat{\sigma}_i^2 \right] + \sum_{i=2}^{p} \sum_{j=1}^{i-1} c_{ij} \left[ \hat{\sigma}_{ij} + (\hat{y}_i - \theta_i)(\hat{y}_j - \theta_j) \right] \tag{36.4} \]

where $c_i$ and $c_{ij}$ represent weights (priorities or costs), and $\hat{\sigma}_i^2$ and $\hat{\sigma}_{ij}$ are elements of the response's variance–covariance structure at $x$ ($\Sigma_y(x)$). A key difference between Eqs. 36.3 and 36.4 is that the latter uses the variance–covariance structure of the responses rather than the variance–covariance structure of the predicted responses. Moreover, Lee and Kim's method requires replicates at each design run, which will certainly increase the time and cost of experimentation. This extra effort is worthwhile only if the variance due to uncontrollable variables is a concern in practice. A difficulty with all loss function-based methods is taking into account different scales, relative variabilities and relative costs in matrix $C$ [12, 21].
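The two expected-loss functions can be written out in plain Python as a rough sketch. The function names are hypothetical. The Vining inputs are illustrative only (in particular the prediction covariance matrix is made up). For the Lee and Kim loss, the inputs are the case I and case II rows of Table 36.2, read under two assumptions: that the three reported weights are $(c_1, c_2, c_{12})$ and that the reported $\hat{\sigma}_1$, $\hat{\sigma}_2$ are standard deviations. Under those assumptions the sketch returns values close to the reported expected losses of 598.1 and 283.7.

```python
def vining_expected_loss(y_hat, theta, C, sigma_pred):
    """Eq. 36.3: (y - theta)^T C (y - theta) + trace(C Sigma_pred)."""
    p = len(y_hat)
    dev = [y_hat[i] - theta[i] for i in range(p)]
    bias = sum(dev[i] * C[i][j] * dev[j] for i in range(p) for j in range(p))
    trace = sum(C[i][j] * sigma_pred[j][i] for i in range(p) for j in range(p))
    return bias + trace


def lee_kim_expected_loss(y_hat, theta, sigma, c_diag, c_off):
    """Eq. 36.4; sigma[i][j] are response (co)variances, c_off[(i, j)] is given for i > j."""
    p = len(y_hat)
    loss = sum(c_diag[i] * ((y_hat[i] - theta[i]) ** 2 + sigma[i][i]) for i in range(p))
    for i in range(1, p):
        for j in range(i):
            cross = sigma[i][j] + (y_hat[i] - theta[i]) * (y_hat[j] - theta[j])
            loss += c_off[(i, j)] * cross
    return loss


# Illustrative Vining inputs; sigma_pred is a made-up prediction covariance.
C = [[0.100, 0.025], [0.025, 0.500]]
v = vining_expected_loss([95.24, 58.27], [100.0, 57.5], C, [[4.0, 0.5], [0.5, 1.0]])

# Lee and Kim, cases I and II of Table 36.2 (weights read as c1, c2, c12: assumption).
theta = [100.0, 300.0]
case1 = lee_kim_expected_loss([97.86, 301.40], theta,
                              [[7.80 ** 2, 6.39], [6.39, 22.96 ** 2]],
                              [1.0, 1.0], {(1, 0): 1.0})
case2 = lee_kim_expected_loss([98.06, 300.32], theta,
                              [[7.84 ** 2, 6.39], [6.39, 22.98 ** 2]],
                              [0.3, 0.5], {(1, 0): 0.02})
print(round(v, 3), round(case1, 1), round(case2, 1))
```

The two cases have almost identical response values, yet the loss changes by a factor of about two because only the weights differ, which is the inconsistency the proposed measures are meant to expose.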

36.3 Measures of Optimization Performance

To evaluate the feasibility of compromise solutions in multiresponse problems, the analyst needs information about the response's properties at "optimal" variable settings, namely the bias and variance. In fact, responses at some variable settings may have considerable variance due to the uncertainty in the regression coefficients of predicted responses and may be sensitive to uncontrollable variables that are significant and, therefore, cannot be ignored.


In the RSM framework few authors have addressed the evaluation of the response's properties to the extent it deserves. In general they focus on the output of the objective function they use. Authors that compare the performance of several methods by evaluating the responses' properties at variable settings through optimization performance measures are Lee and Kim [13], Ko et al. [12] and Xu et al. [23]. While Lee and Kim [13] and Ko et al. [12] use the terms or components of the objective function they propose for comparing the results of loss function-based methods in terms of the desired response's properties, Xu et al. [23] propose new optimization performance measures. The major shortcomings in the proposals of previous authors are the following:

1. The optimization measures used by Lee and Kim [13] and Ko et al. [12] require the definition of a cost matrix, which is not easy to define or readily available.
2. The optimization measures used by Xu et al. [23] only allow the evaluation of the response's bias.

To compare methods' results or compromise solutions in multiresponse optimization problems it is necessary to consider the statistical properties of the methods used in addition to the response's bias and variance. In fact, optimization methods may differ in terms of statistical properties and optimization schemes, so evaluating and comparing the corresponding solutions in a straightforward manner may not be possible. For example, the global desirability values of methods that either minimize or maximize the global desirability are comparable neither with each other directly nor with the result (monetary loss) achieved from a loss function-based method. Optimization measures are therefore proposed with the aim of providing useful information to the analyst or decision-maker about the desired response's properties (bias, quality of predictions, and robustness) and of evaluating the solutions obtained from different methods.
Those measures allow the separate assessment of the bias, quality of predictions, and robustness, which may help the analyst in achieving a solution of interest and guide him/her during the optimization process, in particular when quality of predictions and robustness are important issues in practice. In addition, they may also serve to evaluate the solutions obtained from different methods and help the practitioner or researcher make a more informed decision when he/she is interested in choosing a method for optimizing multiple responses.

To assess the method's solutions in terms of bias, an optimization measure is suggested that considers the response types, the response's specification limits and the deviation of responses from their target. This measure, named cumulative bias ($B_{cum}$), is defined as

\[ B_{cum} = \sum_{i=1}^{p} W_i \left| \hat{y}_i - \theta_i \right| \tag{36.5} \]

where $\hat{y}_i$ represents the estimated response value at "optimal" variable settings, $\theta_i$ is the target value and $W_i$ is a parameter that takes into account the specification limits and response type of the $i$th response. This parameter is defined as follows: $W = 1/(U - L)$ for STB and LTB response types; $W = 2/(U - L)$ for the NTB response type. The cumulative bias gives an overall result of the optimization process instead of focusing on the value of a single response, which prevents unreasonable decisions from being taken in some cases [11]. To assess the bias of each response, the practitioner may use the individual bias ($B_i$) defined as

\[ B_i = W_i \left| \hat{y}_i - \theta_i \right| \tag{36.6} \]

Alternatives to $B_{cum}$ and $B_i$ are presented by Xu et al. [23]. These authors utilize $W_i = 1/\theta_i$ and consider the mean value for the cumulative bias. For the individual bias they consider $W_i = 1$.

The measure proposed for assessing the method's solutions in terms of quality of predictions is defined by

\[ QoP = \mathrm{trace}\left[ \varphi\, \Sigma_{\hat{y}}(x) \right] = \mathrm{trace}\left[ \varphi\, x_j^T \left( X^T Q^{-1} X \right)^{-1} x_j \right] \tag{36.7} \]

where $x_j$ is the subset of independent variables consisting of the $K \times 1$ vector of regressors for the $i$th response, with $N$ observations on $K_i$ regressors for each response, $X$ is an $Np \times K$ block diagonal matrix and $Q = \Sigma \otimes I_N$. An estimate of $\Sigma$ is $\hat{\sigma}_{ij} = \hat{e}_i^T \hat{e}_j / N$, where $\hat{e}_i$ is the residual vector from the OLS estimation of response $i$, $I_N$ is an identity matrix and $\otimes$ represents the Kronecker product. To make $\Sigma_{\hat{y}}(x)$ dimensionless, this matrix is multiplied by the matrix $\varphi$, whose diagonal and non-diagonal elements are $\varphi_{ii} = 1/(U_i - L_i)^2$ and $\varphi_{ir} = 1/[(U_i - L_i)(U_r - L_r)]$ for $i \ne r$, respectively. QoP is defined under the assumption that the seemingly unrelated regression (SUR) method is employed to estimate the regression models (response surfaces), as it yields regression coefficients at least as accurate as those of other popular regression techniques, namely ordinary and generalized least squares [7, 20]. If ordinary least squares is used, the reader is referred to Vining [21], who presents variants of Eq. 36.7 for the case of regression models with equal and different forms.
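The bias measures of Eqs. 36.5 and 36.6 are simple enough to check by hand or with a few lines of code. The sketch below uses hypothetical function names and, as input, the response values and specification limits of Example 1 (cases I and II of Table 36.1), with percent conversion treated as LTB on [80, 100] and thermal activity as NTB on [55, 60].

```python
def bias_weight(lower, upper, response_type):
    """W = 2/(U - L) for NTB, 1/(U - L) for STB and LTB."""
    span = upper - lower
    return 2.0 / span if response_type == "NTB" else 1.0 / span


def individual_bias(y_hat, target, lower, upper, response_type):
    """Eq. 36.6: B_i = W_i * |y_hat_i - theta_i|."""
    return bias_weight(lower, upper, response_type) * abs(y_hat - target)


def cumulative_bias(responses):
    """Eq. 36.5: sum of the individual biases."""
    return sum(individual_bias(*r) for r in responses)


# Each tuple: (y_hat, target, L, U, response type).
case_1 = [(95.19, 100.0, 80.0, 100.0, "LTB"), (57.50, 57.50, 55.0, 60.0, "NTB")]
case_2 = [(98.04, 100.0, 80.0, 100.0, "LTB"), (55.00, 57.50, 55.0, 60.0, "NTB")]
b1 = cumulative_bias(case_1)
b2 = cumulative_bias(case_2)
print(round(b1, 2), round(b2, 2))
```

Because the weights $W_i$ depend only on the specification limits and response type, two solutions with the same response values always receive the same $B_{cum}$, regardless of the priorities used in the optimization.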
The robustness is assessed by

\[ Rob = \mathrm{trace}\left[ \varphi\, \Sigma_{y}(x) \right] \tag{36.8} \]

where $\Sigma_y(x)$ represents the variance–covariance matrix of the "true" responses. Note that replications of the experimental runs are required to assess the robustness, and that matrix $\varphi$ only considers the specification limits of the variance models, while Lee and Kim [13] and Ko et al. [12] use a cost matrix ($C$). Although the replicates increase the time and cost of experimentation, they may provide significant improvements in robustness that outweigh, or at least compensate for, the time and cost spent.
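Equation 36.8 reduces to a sum of scaled variances and covariances, which the following sketch computes. The function names are hypothetical. The inputs are the variance–covariance estimates reported for cases I and III of Example 2 (Table 36.2), read under two assumptions: that the reported $\hat{\sigma}_1$, $\hat{\sigma}_2$ are standard deviations and that the scaling spans are $U - L$ of the two mean responses (100 - 60 and 500 - 300). Under those assumptions the results round to the reported Rob values of 0.053 and 0.036.

```python
def scaling_matrix(spans):
    """phi_ij = 1 / (span_i * span_j), with span_i = U_i - L_i."""
    return [[1.0 / (si * sj) for sj in spans] for si in spans]


def robustness(sigma_y, spans):
    """Eq. 36.8: Rob = trace(phi @ Sigma_y), written out element by element."""
    phi = scaling_matrix(spans)
    p = len(spans)
    return sum(phi[i][j] * sigma_y[j][i] for i in range(p) for j in range(p))


def covariance_from_sd(sd1, sd2, cov12):
    """Build a 2 x 2 covariance matrix from two standard deviations and a covariance."""
    return [[sd1 ** 2, cov12], [cov12, sd2 ** 2]]


spans = [100.0 - 60.0, 500.0 - 300.0]  # assumption: spans of the mean responses
rob_case_1 = robustness(covariance_from_sd(7.80, 22.96, 6.39), spans)
rob_case_3 = robustness(covariance_from_sd(5.89, 23.12, 4.35), spans)
print(round(rob_case_1, 3), round(rob_case_3, 3))
```

The same `robustness` function applied to an estimate of $\Sigma_{\hat{y}}(x)$ instead of $\Sigma_y(x)$ would give the first form of QoP in Eq. 36.7, since both measures are traces of a $\varphi$-scaled covariance matrix.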


$B_i$, $B_{cum}$, QoP and Rob are dimensionless ratios, so concerns about the dimensional consistency of the responses do not arise. These measures do not exclude others from being used as well and, in terms of results, the lower their values are, the better the compromise solution will be. In practice, all the proposed measures take values greater than or equal to zero, and zero is the most favorable value.

36.4 Examples

Two examples from the literature illustrate the utility of the proposed performance measures. The first considers a case study where the quality of predictions is the adverse condition; the methods introduced by Ch'ng et al. [2] and Vining [21] are used. The second considers robustness as the adverse condition; the methods introduced by Ch'ng et al. [2] and Lee and Kim [13] are used.

Example 1 The responses' specification limits and targets for the percent conversion ($y_1$) and thermal activity ($y_2$) of a polymer are the following: $\hat{y}_1 \ge 80.00$ with $U_1 = \theta_1 = 100$; $55.00 \le \hat{y}_2 \le 60.00$ with $\theta_2 = 57.50$. Reaction time ($x_1$), reaction temperature ($x_2$), and amount of catalyst ($x_3$) are the control factors. According to Myers and Montgomery [17], the objective was to maximize the percent conversion and achieve the nominal value for the thermal activity. A central composite design with six axial and six center points, with $-1.682 \le x_i \le 1.682$, was run to generate the data. The predicted responses, fitted by the SUR method, are as follows:

\[ \hat{y}_1 = 81.09 + 1.03x_1 + 4.04x_2 + 6.20x_3 - 1.83x_1^2 + 2.94x_2^2 - 5.19x_3^2 + 2.13x_1x_2 + 11.38x_1x_3 - 3.88x_2x_3 \]

\[ \hat{y}_2 = 59.85 + 3.58x_1 + 0.25x_2 + 2.23x_3 - 0.83x_1^2 + 0.07x_2^2 - 0.06x_3^2 - 0.39x_1x_2 - 0.04x_1x_3 + 0.31x_2x_3 \]

The model of the thermal activity includes some insignificant regressors ($x_2$, $x_1^2$, $x_2^2$, $x_3^2$, $x_1x_2$, $x_1x_3$, $x_2x_3$), so the predicted response has a poor quality of prediction. In particular, the variance of this estimated response grows as the variable settings move farther from the origin. The variance–covariance matrix is estimated as

\[ \hat{\Sigma} = \begin{bmatrix} 11.12 & 0.55 \\ 0.55 & 1.55 \end{bmatrix}. \]

Table 36.1 Results: Example 1

           Ch'ng et al.                                                                     Vining
           Case I                     Case II                    Case III
Weights    (0.30, 0.70)               (0.50, 0.50)               (0.60, 0.40)               C = [0.100 0.025; 0.025 0.500]
x_i        (-0.544, 1.682, -0.599)    (-1.682, 1.682, -1.059)    (-0.538, 1.682, -0.604)    (-0.355, 1.682, -0.468)
ŷ_i        (95.19, 57.50)             (98.04, 55.00)             (95.19, 57.50)             (95.24, 58.27)
Result     D = 0.14                   D = 0.35                   D = 0.29                   E(loss) = 3.86
Bcum       0.24                       1.10                       0.24                       0.55
Bi         (0.24, 0.00)               (0.10, 1.00)               (0.24, 0.00)               (0.24, 0.31)
QoP        0.08                       0.31                       0.08                       0.06

As regards the results, Table 36.1 shows that the global desirability function ($D$) yields different values for the same response values (cases I and III). This is neither desirable nor reasonable, and may confuse analysts who rely on the value of $D$ for making decisions. In contrast, $B_{cum}$ and QoP remain unchanged, as expected in these instances. By using these measures the analyst can easily perceive whether the changes he/she made in the weights are favorable or unfavorable in terms of response values. When $B_{cum}$ or QoP increases, it means that the changes made in the weights are unfavorable: either the value of at least one of the responses is farther from its target, as is the case for $\hat{y}_2$ in Vining's solution, or the quality of predictions is lower, as occurs in case II. Case II illustrates that, by looking at the QoP value, the analyst can distinguish solutions with larger variability from others with smaller variability, for example cases I and III. Vining's solution is the best in terms of the QoP value, because the $x_1$ and $x_3$ values are slightly closer to the origin than in the other cases, namely cases I and III, which present the same value of QoP. These results provide evidence that the proposed measures give better indications (information) to the analyst and can help him/her in achieving feasible solutions if the quality of predictions is an adverse condition.

Example 2 Lee and Kim [13] assumed that the fitted response functions for the process mean, variance and covariance of two quality characteristics are as follows:

\[ \hat{y}_1 = 79.04 + 17.74x_1 + 0.62x_2 + 14.79x_3 - 0.70x_1^2 - 10.95x_2^2 - 0.10x_3^2 - 5.39x_1x_2 + 1.21x_1x_3 - 1.79x_2x_3 \]

\[ \hat{\sigma}_1 = 4.54 + 3.92x_1 + 4.29x_2 + 1.66x_3 + 1.15x_1^2 + 4.40x_2^2 + 0.94x_3^2 + 3.49x_1x_2 + 0.74x_1x_3 + 1.19x_2x_3 \]

\[ \hat{y}_2 = 400.15 - 95.21x_1 - 28.98x_2 - 55.99x_3 + 20.11x_1^2 + 26.80x_2^2 + 10.91x_3^2 + 57.13x_1x_2 - 3.73x_1x_3 - 10.87x_2x_3 \]

\[ \hat{\sigma}_2 = 26.11 - 1.34x_1 + 6.71x_2 + 0.37x_3 + 0.77x_1^2 + 2.99x_2^2 - 0.97x_3^2 - 1.81x_1x_2 + 0.41x_1x_3 \]


N. Costa et al.

Table 36.2 Results: Example 2

| | Lee and Kim, Case I | Lee and Kim, Case II | Lee and Kim, Case III | Ch’ng et al. |
|---|---|---|---|---|
| Weights | (1, 1, 1) | (0.3, 0.5, 0.02) | (0.8, 0.3, 1.0) | (0.25, 0.25, 0.15, 0.35) |
| x_i | (0.79, -0.76, 1.00) | (0.80, -0.77, 1.00) | (1.00, -1.00, 0.43) | (0.80, -0.75, 1.00) |
| ŷ_i | (97.86, 301.40) | (98.06, 300.32) | (74.22, 346.45) | (98.18, 300.00) |
| Var–cov | (7.80, 22.96, 6.39) | (7.84, 22.98, 6.39) | (5.89, 23.12, 4.35) | (7.86, 22.99, 6.38) |
| Result | E(loss) = 598.1 | E(loss) = 283.7 | E(loss) = 173.9 | D = 0.53 |
| Bcum | 1.76 | 1.75 | 2.39 | 1.75 |
| Bi | (0.05, 0.78, 0.01, 0.92) | (0.05, 0.78, 0.00, 0.92) | (0.64, 0.59, 0.23, 0.92) | (0.05, 0.79, 0.00, 0.92) |
| Rob | 0.053 | 0.053 | 0.036 | 0.053 |

\[ \hat{\sigma}_{12} = 5.45 - 0.77x_1 + 0.16x_2 + 0.49x_3 - 0.42x_1^2 + 0.50x_2^2 - 0.35x_3^2 - 0.63x_1x_2 + 1.13x_1x_3 - 0.30x_2x_3 \]

In this example it is assumed that the responses’ specifications are: $\hat{y}_1 \ge 60$ with $U_1 = \theta_1 = 100$; $\hat{\sigma}_1 \le 10$ with $L_1 = \theta_1 = 0$; $\hat{y}_2 \le 500$ with $L_2 = \theta_2 = 300$; $\hat{\sigma}_2 \le 25$ with $L_2 = \theta_2 = 0$; $-1 \le x_i \le 1$.

As regards the results, Table 36.2 shows that the loss function proposed by Lee and Kim yields different expected loss values for solutions with marginal differences in the response values (cases I and II). In contrast, the Bcum value remains practically unchanged in these situations, confirming its utility for assessing compromise solutions for multiresponse problems. Moreover, note that the lowest expected loss value is obtained from the solution with the worst values for $\hat{y}_1$ and $\hat{y}_2$ (case III), which is absurd. Nevertheless, this example provides evidence that, by looking at the Rob value, the analyst can distinguish more robust solutions (case III) from others with larger variability due to uncontrollable factors (cases I and II, and Ch’ng et al.’s solution). Note that Ch’ng et al.’s method yields a solution similar to cases I and II when appropriate weights are assigned to $\hat{y}_2$ and $\hat{\sigma}_2$, while the weights for $\hat{y}_1$ and $\hat{\sigma}_1$ remain unchanged (equal to 0.25).

This example confirms that the proposed measures give useful information to the analyst and can help him/her in achieving feasible solutions if the robustness is an adverse condition.
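As a quick consistency check on Example 2, the fitted standard deviation and covariance models can be evaluated at the Case I settings and compared with the Var–cov row of Table 36.2. The sketch below is illustrative only: the helper name `quadratic` is hypothetical, and the coefficient signs are an assumption wherever the printed minus signs did not survive in the text.

```python
def quadratic(coef, x):
    """Evaluate a full second-order model in three coded variables."""
    b0, b1, b2, b3, b11, b22, b33, b12, b13, b23 = coef
    x1, x2, x3 = x
    return (b0 + b1 * x1 + b2 * x2 + b3 * x3
            + b11 * x1 ** 2 + b22 * x2 ** 2 + b33 * x3 ** 2
            + b12 * x1 * x2 + b13 * x1 * x3 + b23 * x2 * x3)

# Coefficients transcribed from Example 2 (signs assumed where lost in print).
# Order: intercept, x1, x2, x3, x1^2, x2^2, x3^2, x1x2, x1x3, x2x3.
SIGMA1 = (4.54, 3.92, 4.29, 1.66, 1.15, 4.40, 0.94, 3.49, 0.74, 1.19)
SIGMA2 = (26.11, -1.34, 6.71, 0.37, 0.77, 2.99, -0.97, -1.81, 0.41, 0.0)
SIGMA12 = (5.45, -0.77, 0.16, 0.49, -0.42, 0.50, -0.35, -0.63, 1.13, -0.30)

x_case1 = (0.79, -0.76, 1.00)  # Case I settings, rounded as in Table 36.2
var_cov = [quadratic(c, x_case1) for c in (SIGMA1, SIGMA2, SIGMA12)]
print([round(v, 2) for v in var_cov])
```

Because the settings in the table are rounded to two decimals, the computed values should be expected to agree with the tabulated (7.80, 22.96, 6.39) only to within a few hundredths.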

36.5 Discussion

The optima are stochastic by nature, and understanding the variability of responses is a critical issue for practitioners. Thus, assessing the responses’ sensitivity to uncontrollable factors, in addition to the estimated responses’ variance level at "optimal" variable settings, by appropriate measures provides the information the analyst requires for evaluating compromise solutions in multiresponse optimization problems. For this purpose the QoP and Rob measures are introduced, in addition to measures for assessing the response’s bias (Bi and Bcum).

The previous examples show that the expected loss and global desirability functions may give inconsistent and incomplete information to the analyst about the methods’ solutions, namely in terms of the merit of the final solution and the desired response properties. This is a relevant shortcoming, which is due to the different weights or priorities assigned to responses in the composite function. Those composite functions yield different results in cases where the solutions are equal or differ only slightly in the response values, as illustrated in Examples 1 and 2. Example 2 also shows that absurd results may occur in loss functions if the elements of matrix C are not defined properly. In fact, the loss coefficients (cij) play a major role in the achievement of optimal parameter conditions that result in trade-offs of interest among responses [22]. In particular, the non-diagonal elements represent incremental costs incurred when pairs of responses are simultaneously off-target, and have to satisfy theoretical conditions that the practitioner may not be aware of or take into account. Those conditions for symmetric loss functions are: c11 ; c22 0 and 2c11 c22 c12 c11 c22 . When these conditions are not satisfied, worse solutions may produce spuriously better (lower) values in the loss function, as illustrated by case III in Example 2. Wu and Chyu [22] provide guidelines for defining the cij for symmetric and asymmetric loss functions, but additional subjective information is required from the analyst.

Therefore, if the analyst focuses only on the result of the composite function used for making decisions, he/she may ignore a solution of interest or be confounded about the directions for changing weights or priorities to responses, as the composite function may give unreliable information. By using the proposed measures the analyst does not have to worry about the reliability of the information, as they do not depend on the priorities assigned to responses. For this reason, the proposed measures may also serve to compare the performance of methods that use different approaches, for example desirability function-based methods and loss function-based methods. Similarly, they make it possible to compare methods structured under the same approach but that use different composite functions, as is the case of Derringer and Suich’s method, where the composite function is a multiplicative function that must be maximized, and Ch’ng et al.’s method, where the composite function is an additive function that must be minimized.

From a theoretical point of view, methods that consider the responses’ variance level and exploit the responses’ correlation information lead to solutions that are more realistic when the responses either have significantly different variance levels or are highly correlated [12]. However, the previous examples show that the proposed measures can provide useful information to the analyst so that he/she achieves compromise solutions with desired properties at "optimal" settings by using


methods that do not consider the variance–covariance structure of responses in the objective function. Nevertheless, it is important to highlight that points in non-convex response surfaces cannot be captured by weighted sums like those represented by the objective functions reviewed here, even if the proposed measures are used. Publications where this and other limitations of the methods are addressed include Das and Dennis [4] and Mattson and Messac [14]. This means that the proposed optimization measures are not a panacea for achieving optimal solutions. In fact, Messac et al. [15] demonstrated that the ability of an objective function to capture points in convex and concave surfaces depends on the presence of parameters that the analyst can use to manipulate the composite function’s curvature. Although they show that using exponents to assign priorities to responses is a more effective practice for capturing points in convex and highly concave surfaces, assigning weights to responses remains a critical task in multiresponse problems. It usually involves an undefined trial-and-error weight-tweaking process that may be a source of frustration and significant inefficiency, particularly when the number of responses and control factors is large. So, the need for methods where minimum subjective information is required from the analyst is apparent. A possible choice is the method proposed by Costa [3]. According to this author, besides the low number of weights required from the analyst, the method he proposes has three major characteristics: effectiveness, simplicity and ease of application. This makes the method appealing to use in practice and supports the development of an iterative procedure to achieve compromise solutions to multiresponse problems in the RSM framework.

Despite the potential usefulness of interactive procedures in finding compromise solutions of interest, due attention has not been paid to procedures that facilitate the preference articulation process for multiresponse problems in the RSM framework.

36.6 Conclusion and Future Work

Low bias and minimum variance are the desired response properties at optimal variable settings in multiresponse problems. Thus, optimization performance measures are proposed that can be used with the existing methods to facilitate the evaluation of compromise solutions in terms of these properties. They can be easily implemented by analysts and allow the separate assessment of the bias, quality of predictions, and robustness of those solutions. This is useful because the analyst can explore the method’s results with emphasis on the property or properties of interest. In fact, compromise solutions may exist where some responses are more favorable than others in terms of bias, quality of predictions or robustness. In these instances, the analyst has relevant information available to assign priorities to responses and make a more informed decision based on economic and technical considerations. As the assignment of priorities to responses is an open research field, an iterative procedure that considers the results of the proposed optimization measures arises as an interesting research topic.


References

1. Ayvaz M, Tamer K, Ali H, Ceylan H, Gurarslan G (2009) Hybridizing the harmony search algorithm with a spreadsheet ‘Solver’ for solving continuous engineering optimization problems. Eng Optim 41(12):1119–1144
2. Ch’ng C, Quah S, Low H (2005) A new approach for multiple response optimization. Qual Eng 17(4):621–626
3. Costa N (2010) Simultaneous optimization of mean and standard deviation. Qual Eng 22(3):140–149
4. Das I, Dennis J (1997) A closer look at drawbacks of minimizing weighted sums of objectives for Pareto set generation in multicriteria optimization problems. Struct Optim 14(1):63–69
5. Derringer G (1994) A balancing act: optimizing product’s properties. Qual Prog 24:51–58
6. Derringer G, Suich R (1980) Simultaneous optimization of several response variables. J Qual Tech 12(4):214–218
7. Fogliatto F, Albin L (2000) Variance of predicted response as an optimization criterion in multiresponse experiments. Qual Eng 12(4):523–533
8. Gauri S, Pal S (2010) Comparison of performances of five prospective approaches for the multi-response optimization. Int J Adv Manuf Technol 48(12):1205–1220
9. Goh T (2009) Statistical thinking and experimental design as dual drivers of DFSS. Int J Six Sigma Compet Adv 5(1):2–9
10. Kazemzadeh R, Bashiri M, Atkinson A, Noorossana R (2008) A general framework for multiresponse optimization problems based on goal programming. Eur J Oper Res 189(2):421–429
11. Kim K, Lin D (2000) Simultaneous optimization of multiple responses by maximizing exponential desirability functions. Appl Stat Ser C 49(3):311–325
12. Ko Y, Kim K, Jun C (2005) A new loss function-based method for multiresponse optimization. J Qual Tech 37(1):50–59
13. Lee M, Kim Y (2007) Separate response surface modeling for multiple response optimization: multivariate loss function approach. Int J Ind Eng 14(2):227–235
14. Mattson C, Messac A (2003) Concept selection using s-Pareto frontiers. AIAA J 41(6):1190–1198
15. Messac A, Sundararaj G, Tappeta R, Renaud J (2000) Ability of objective functions to generate points on non-convex Pareto frontiers. AIAA J 38(6):1084–1091
16. Murphy T, Tsui K, Allen J (2005) A review of robust design methods for multiple responses. Res Eng Des 15(4):201–215
17. Myers R, Montgomery D (2002) Response surface methodology: process and product optimization using designed experiments, 2nd edn. Wiley, New Jersey
18. Myers R, Montgomery D, Anderson-Cook C (2009) Response surface methodology: process and product optimization using designed experiments, 3rd edn. Wiley, New York
19. Pignatiello J (1993) Strategies for robust multiresponse. IIE Trans 25(1):5–15
20. Shah H, Montgomery D, Matthew W (2004) Response surface modeling and optimization in multiresponse experiments using seemingly unrelated regressions. Qual Eng 16(3):387–397
21. Vining G (1998) A compromise approach to multiresponse optimization. J Qual Tech 30(4):309–313
22. Wu F, Chyu C (2004) Optimization of robust design for multiple quality characteristics. Int J Prod Res 42(2):337–354
23. Xu K, Lin D, Tang L, Xie M (2004) Multiresponse systems optimization using a goal attainment approach. IIE Trans 36(5):433–445
24. Yildiz A (2009) A new design optimization framework based on immune algorithm and Taguchi’s method. Comput Ind 60(8):613–620
25. Younis A, Dong Z (2010) Trends, features, and tests of common and recently introduced global optimization methods. Eng Optim 42(8):691–718

Chapter 37

Inspection Policies in Service of Fatigued Aircraft Structures

Nicholas A. Nechval, Konstantin N. Nechval and Maris Purgailis

Abstract Fatigue is one of the most important problems of aircraft, arising from their nature as multiple-component structures subjected to random dynamic loads. To guarantee safety, the structural life ceiling limits of the fleet aircraft are defined using three distinct approaches: Safe-Life, Fail-Safe, and Damage Tolerance. The common objective in defining fleet aircraft lives by the three approaches is to ensure safety while at the same time reducing total ownership costs. In this paper, the damage tolerance approach is considered and the focus is on the inspection scheme with decreasing intervals between inspections. The paper proposes an analysis methodology to determine appropriate decreasing intervals between inspections of fatigue-sensitive aircraft structures (as an alternative to the constant intervals between inspections often used in practice), so that the risk of a catastrophic accident during flight is minimized. The suggested approach is novel in that it allows one to use the results of earlier inspections of fatigued aircraft structures judiciously, both to determine the time of the next inspection and to estimate the values of several parameters involved in the problem that can be treated as uncertain. Using in-service damage data and taking into account safety risk and maintenance cost at the same time, the approach is proposed for assessing the reliability of aircraft structures subject to fatigue damage. An illustrative example is given.

N. A. Nechval (corresponding author) Department of Statistics, EVF Research Institute, University of Latvia, Raina Blvd 19, Riga, LV-1050, Latvia e-mail: [email protected]

K. N. Nechval Department of Applied Mathematics, Transport and Telecommunication Institute, Lomonosov Street 1, Riga, LV-1019, Latvia e-mail: [email protected]

M. Purgailis Department of Cybernetics, University of Latvia, Raina Blvd 19, Riga, LV-1050, Latvia e-mail: [email protected]

S. I. Ao and L. Gelman (eds.), Electrical Engineering and Applied Computing, Lecture Notes in Electrical Engineering, 90, DOI: 10.1007/978-94-007-1192-1_37, Springer Science+Business Media B.V. 2011

37.1 Introduction

In spite of decades of investigation, the fatigue response of materials is yet to be fully understood. This is partially due to the complexity of loading in which two or more loading axes fluctuate with time. Examples of structures experiencing such complex loadings are automobiles, aircraft, offshore structures, railways and nuclear plants. While most industrial failures involve fatigue, the assessment of the fatigue reliability of industrial components subjected to various dynamic loading situations is one of the most difficult engineering problems. This is because material degradation processes due to fatigue depend upon material characteristics, component geometry, loading history and environmental conditions. The traditional analytical method of engineering fracture mechanics (EFM) usually assumes that crack size, stress level, material properties, crack growth rate, etc. are all deterministic values, which leads to conservative or very conservative outcomes. However, according to many experimental results and field data, even in well-controlled laboratory conditions, crack growth results usually show considerable statistical variability [1].

The analysis of fatigue crack growth is one of the most important tasks in the design and life prediction of fatigue-sensitive aircraft structures (for instance, wing, fuselage) and their components (for instance, aileron or balancing flap as part of the wing panel, stringer, etc.). Several probabilistic or stochastic models have been employed to fit the data from various fatigue crack growth experiments. Among them are the Markov chain model [2], the second-order approximation model [3], and the modified second-order polynomial model [4]. Each of the models may be the most appropriate one to depict a particular set of fatigue growth data but not necessarily the others. All models can be improved to depict the growth data very accurately but, of course, at the cost of increasing computational complexity. Yang’s model [3] and the polynomial model [4] are considered more appropriate than the Markov chain model [2] by some researchers because they introduce a differential equation which indicates that the fatigue crack growth rate is a function of crack size and other parameters. The parameters, however, can only be determined through the observation and measurement of many crack growth samples. Unfortunately, the above models are mathematically too complicated for fatigue researchers as well as design engineers. A large gap still needs to be bridged between the fatigue experimentalists and the researchers who use probabilistic methods to study fatigue crack growth problems.

Airworthiness regulations require proof that aircraft can be operated safely. This implies that critical components must be replaced or repaired before safety is compromised. To guarantee safety, the structural life ceiling limits of the fleet aircraft are defined using three distinct approaches: Safe-Life, Fail-Safe, and Damage Tolerance. The common objective in defining fleet aircraft lives by the three approaches is to ensure safety while at the same time reducing total ownership costs. Although the objectives of the three approaches are the same, they vary with regard to the fundamental definition of service life. The Safe-Life approach is based on the concept that significant damage, i.e. fatigue cracking, will not develop during the service life of a component. When the service life equals the design Safe-Life, the component must be replaced. The Fail-Safe approach assumes initial damage as manufactured and its subsequent growth during service to detectable crack sizes or greater. Service life in Fail-Safe structures can thus be defined as the time to a service-detectable damage. However, there are two major drawbacks to the Safe-Life and Fail-Safe approaches: (1) components are taken out of service even though they may have substantial remaining lives; (2) despite all precautions, cracks sometimes occur prematurely. These facts led the airlines to introduce the Damage Tolerance approach, which is based on the concept that damage can occur and develop during the service life of a component.

In this paper, the Damage Tolerance approach is considered and the focus is on the inspection scheme with decreasing intervals between inspections. From an engineering standpoint the fatigue life of a component or structure consists of two periods: (i) the crack initiation period, which starts with the first load cycle and ends when a technically detectable crack is present, and (ii) the crack propagation period, which starts with a technically detectable crack and ends when the remaining cross section can no longer withstand the loads applied and fails statically.

Periodic inspections of aircraft are common practice in order to maintain their reliability above a desired minimum level. The appropriate inspection intervals are determined so that the fatigue reliability of the entire aircraft structure remains above the minimum reliability level throughout its service life.

37.2 Inspection Scheme Under Fatigue Crack Initiation

In this section we first consider the problem of estimating the minimum time to crack initiation (warranty period, or time to the first inspection) for a number of aircraft structure components, before which no detectable cracks occur in the material, based on the results of previous warranty period tests on the structure components in question. If in a fleet of k aircraft there are km of the same individual structure components, operating independently, the length of time until the first crack initially forms in any of these components is of basic interest, and provides a measure of assurance concerning the operation of the components in question. This leads to the following problem. Suppose we have observations $X_1, \ldots, X_n$ as the results of tests conducted on the components; suppose also that there are km components of the same kind to be put into future use, with times to crack initiation $Y_1, \ldots, Y_{km}$. Then we want to be able to estimate, on the basis of $X_1, \ldots, X_n$, the shortest time to crack initiation $Y_{(1,km)}$ among the times to crack initiation $Y_1, \ldots, Y_{km}$. In other words, it is desirable to construct a lower simultaneous prediction limit, $L_\gamma$, which is exceeded with probability $\gamma$ by observations or functions of observations of all k future samples, each consisting of m units. In this section, the problem of estimating $Y_{(1,km)}$, the smallest of all k future samples of m observations from the underlying distribution, based on an observed sample of n observations from the same distribution, is considered.

Assigning the time interval until the first inspection. Experiments show that the number of flight cycles (hours) at which a technically detectable crack will appear in a fatigue-sensitive component of aircraft structure follows the two-parameter Weibull distribution. The probability density function for the random variable X of the two-parameter Weibull distribution is given by

\[ f(x \mid \beta, \delta) = \frac{\delta}{\beta} \left( \frac{x}{\beta} \right)^{\delta - 1} \exp\left[ -\left( \frac{x}{\beta} \right)^{\delta} \right] \quad (x > 0), \tag{37.1} \]

where $\delta > 0$ and $\beta > 0$ are the shape and scale parameters, respectively. The following theorem is used to assign the time interval until the first inspection (warranty period).

Theorem 1 (Lower one-sided prediction limit for the lth order statistic of the Weibull distribution). Let $X_1 < \cdots < X_r$ be the first r ordered past observations from a sample of size n from the distribution (37.1). Then a lower one-sided conditional $(1-\alpha)$ prediction limit h on the lth order statistic $Y_l$ of a set of m future ordered observations $Y_1 < \cdots < Y_m$ is given by

\[
\begin{aligned}
\Pr\{Y_l > h \mid \mathbf{z}\}
&= \Pr\{\hat{\delta} \ln(Y_l/\hat{\beta}) > \hat{\delta} \ln(h/\hat{\beta}) \mid \mathbf{z}\}
 = \Pr\{W_l > w_h \mid \mathbf{z}\} \\
&= \frac{\displaystyle \sum_{j=0}^{l-1} \binom{l-1}{j} \frac{(-1)^{l-1-j}}{m-j} \int_0^\infty v^{r-2} e^{v \hat{\delta} \sum_{i=1}^{r} \ln(x_i/\hat{\beta})} \left[ (m-j) e^{v w_h} + \sum_{i=1}^{r} e^{v \hat{\delta} \ln(x_i/\hat{\beta})} + (n-r) e^{v \hat{\delta} \ln(x_r/\hat{\beta})} \right]^{-r} dv}
        {\displaystyle \sum_{j=0}^{l-1} \binom{l-1}{j} \frac{(-1)^{l-1-j}}{m-j} \int_0^\infty v^{r-2} e^{v \hat{\delta} \sum_{i=1}^{r} \ln(x_i/\hat{\beta})} \left[ \sum_{i=1}^{r} e^{v \hat{\delta} \ln(x_i/\hat{\beta})} + (n-r) e^{v \hat{\delta} \ln(x_r/\hat{\beta})} \right]^{-r} dv} \\
&= 1 - \alpha,
\end{aligned}
\tag{37.2}
\]

where $\hat{\beta}$ and $\hat{\delta}$ are the maximum likelihood estimators of $\beta$ and $\delta$ based on the first r ordered past observations $(X_1, \ldots, X_r)$ from a sample of size n from the Weibull distribution, which can be found from the solution of

\[ \hat{\beta} = \left( \left[ \sum_{i=1}^{r} x_i^{\hat{\delta}} + (n-r) x_r^{\hat{\delta}} \right] \Big/ r \right)^{1/\hat{\delta}}, \tag{37.3} \]

and

\[ \hat{\delta} = \left[ \left( \sum_{i=1}^{r} x_i^{\hat{\delta}} \ln x_i + (n-r) x_r^{\hat{\delta}} \ln x_r \right) \left( \sum_{i=1}^{r} x_i^{\hat{\delta}} + (n-r) x_r^{\hat{\delta}} \right)^{-1} - \frac{1}{r} \sum_{i=1}^{r} \ln x_i \right]^{-1}, \tag{37.4} \]

\[ \mathbf{z} = (z_1, z_2, \ldots, z_{r-2}), \qquad Z_i = \hat{\delta} \ln(X_i/\hat{\beta}), \quad i = 1, \ldots, r-2, \tag{37.5} \]

\[ W_l = \hat{\delta} \ln(Y_l/\hat{\beta}), \qquad w_h = \hat{\delta} \ln(h/\hat{\beta}). \tag{37.6} \]

(Observe that an upper one-sided conditional $\alpha$ prediction limit h on the lth order statistic $Y_l$ may be obtained from a lower one-sided conditional $(1-\alpha)$ prediction limit by replacing $1-\alpha$ by $\alpha$.) □

Proof The proof is given by Nechval et al. [5] and so it is omitted here.
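The estimating equations (37.3) and (37.4) have no closed-form solution, so in practice they are solved numerically. The following is a minimal sketch, assuming a plain bisection on (37.4) rewritten as a root-finding problem; the function name `weibull_mle` and the bracketing interval are illustrative assumptions, not part of the original method. It is applied to the stringer data used in the example later in this section.

```python
import math

def weibull_mle(x, n):
    """Solve (37.3)-(37.4) for the first r ordered observations of a sample of size n.

    x : the r smallest observations, sorted in ascending order
    n : total sample size (n == len(x) for a complete sample)
    """
    r = len(x)
    mean_log = sum(math.log(v) for v in x) / r

    def score(delta):
        # (37.4) rearranged as score(delta) = 0; increasing in delta.
        w = [v ** delta for v in x]
        s0 = sum(w) + (n - r) * w[-1]
        s1 = (sum(wi * math.log(v) for wi, v in zip(w, x))
              + (n - r) * w[-1] * math.log(x[-1]))
        return s1 / s0 - mean_log - 1.0 / delta

    lo, hi = 0.01, 100.0           # assumed bracket for the shape parameter
    for _ in range(200):           # plain bisection
        mid = 0.5 * (lo + hi)
        if score(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    delta = 0.5 * (lo + hi)

    s0 = sum(v ** delta for v in x) + (n - r) * x[-1] ** delta
    beta = (s0 / r) ** (1.0 / delta)   # (37.3)
    return beta, delta

# Complete sample (r = n = 5), times to crack initiation in 10^4 flight hours.
beta_hat, delta_hat = weibull_mle([5.0, 6.25, 7.5, 7.9, 8.1], n=5)
print(beta_hat, delta_hat)
```

The result can be compared with the estimates reported in the example. For data on a much larger numerical scale, rescaling the observations first would avoid overflow in the power terms.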

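Corollary 1.1 A lower one-sided conditional $(1-\alpha)$ prediction limit h on the minimum $Y_1$ of a set of m future ordered observations $Y_1 \le \cdots \le Y_m$ is given by

\[
\begin{aligned}
\Pr\{Y_1 > h \mid \mathbf{z}\}
&= \Pr\{\hat{\delta} \ln(Y_1/\hat{\beta}) > \hat{\delta} \ln(h/\hat{\beta}) \mid \mathbf{z}\}
 = \Pr\{W_1 > w_h \mid \mathbf{z}\} \\
&= \frac{\displaystyle \int_0^\infty v^{r-2} e^{v \hat{\delta} \sum_{i=1}^{r} \ln(x_i/\hat{\beta})} \left[ m e^{v w_h} + \sum_{i=1}^{r} e^{v \hat{\delta} \ln(x_i/\hat{\beta})} + (n-r) e^{v \hat{\delta} \ln(x_r/\hat{\beta})} \right]^{-r} dv}
        {\displaystyle \int_0^\infty v^{r-2} e^{v \hat{\delta} \sum_{i=1}^{r} \ln(x_i/\hat{\beta})} \left[ \sum_{i=1}^{r} e^{v \hat{\delta} \ln(x_i/\hat{\beta})} + (n-r) e^{v \hat{\delta} \ln(x_r/\hat{\beta})} \right]^{-r} dv} \\
&= 1 - \alpha.
\end{aligned}
\tag{37.7}
\]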
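The ratio of integrals in (37.7) has to be evaluated numerically. The sketch below is one possible way to do so, assuming composite Simpson integration on a truncated range and a log-domain evaluation of the integrand to avoid overflow; the function names, the truncation point `upper`, the step count, and the value of m used in the demonstration are all assumptions made for illustration.

```python
import math

def integral_37_7(z, n, m, w, upper=40.0, steps=4000):
    """Simpson approximation of the integral in (37.7).

    z : values z_i = delta_hat * ln(x_i / beta_hat) for the r observed x_i
    m : number of future observations (m = 0 yields the denominator)
    w : w_h = delta_hat * ln(h / beta_hat); ignored when m = 0
    Assumes r >= 2 and an even number of steps.
    """
    r = len(z)
    sum_z = sum(z)

    def f(v):
        if v == 0.0:
            return (1.0 if r == 2 else 0.0) / (m + n) ** r
        logs = [v * zi for zi in z]
        if n > r:
            logs.append(math.log(n - r) + v * z[-1])
        if m > 0:
            logs.append(math.log(m) + v * w)
        top = max(logs)
        log_bracket = top + math.log(sum(math.exp(t - top) for t in logs))
        return math.exp((r - 2) * math.log(v) + v * sum_z - r * log_bracket)

    h = upper / steps
    total = f(0.0) + f(upper)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * f(k * h)
    return total * h / 3.0

def prob_min_exceeds(h_limit, x, n, m, beta_hat, delta_hat):
    """Pr{Y_1 > h | z} from (37.7) for candidate limit h_limit."""
    z = [delta_hat * math.log(xi / beta_hat) for xi in x]
    w_h = delta_hat * math.log(h_limit / beta_hat)
    return integral_37_7(z, n, m, w_h) / integral_37_7(z, n, 0, 0.0)

# Stringer data and estimates from the example in this section; m = 10 is hypothetical.
x = [5.0, 6.25, 7.5, 7.9, 8.1]
p = prob_min_exceeds(2.5549, x, n=5, m=10, beta_hat=7.42603, delta_hat=7.9081)
print(p)
```

To obtain the limit h for a given confidence level, one would search for the value at which `prob_min_exceeds` equals $1-\alpha$, for instance by bisection, since the probability decreases monotonically as h grows.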

Thus, when l = 1, (37.2) reduces to formula (37.7).

Theorem 2 (Lower one-sided prediction limit for the lth order statistic of the exponential distribution). Under the conditions of Theorem 1, if $\delta = 1$, we deal with the exponential distribution, the probability density function of which is given by

\[ f(x \mid \beta) = \frac{1}{\beta} \exp\left( -\frac{x}{\beta} \right) \quad (x > 0). \tag{37.8} \]

Then a lower one-sided conditional $(1-\alpha)$ prediction limit h on the lth order statistic $Y_l$ of a set of m future ordered observations $Y_1 < \cdots < Y_m$ is given by

\[
\begin{aligned}
\Pr\{Y_l > h \mid S_b = s_b\}
&= \Pr\left\{ \frac{Y_l}{S_b} > \frac{h}{s_b} \,\Big|\, S_b = s_b \right\}
 = \Pr\{W_l > w_h\} \\
&= \frac{1}{B(l, m-l+1)} \sum_{j=0}^{l-1} \binom{l-1}{j} (-1)^j \frac{1}{(m-l+1+j)\,[1 + w_h (m-l+1+j)]^{r}}
 = 1 - \alpha,
\end{aligned}
\tag{37.9}
\]

where

\[ W_l = \frac{Y_l}{S_b}, \qquad w_h = \frac{h}{s_b}, \qquad S_b = \sum_{i=1}^{r} X_i + (n-r) X_r. \tag{37.10} \]

Proof It follows readily from the standard theory of order statistics that the distribution of the lth order statistic $Y_l$ from a set of m future ordered observations $Y_1 \le \cdots \le Y_m$ is given by

\[ f(y_l \mid \beta)\, dy_l = \frac{1}{B(l, m-l+1)} [F(y_l \mid \beta)]^{l-1} [1 - F(y_l \mid \beta)]^{m-l}\, dF(y_l \mid \beta), \tag{37.11} \]

where

\[ F(x \mid \beta) = 1 - \exp(-x/\beta). \tag{37.12} \]

The factorization theorem gives

\[ S_b = \sum_{i=1}^{r} X_i + (n-r) X_r \tag{37.13} \]

sufficient for $\beta$. The density of $S_b$ is given by

\[ g(s_b \mid \beta) = \frac{1}{\Gamma(r) \beta^{r}}\, s_b^{r-1} \exp\left( -\frac{s_b}{\beta} \right), \quad s_b \ge 0. \tag{37.14} \]

Since $Y_l$ and $S_b$ are independent, the joint density of $Y_l$ and $S_b$ is

\[ f(y_l, s_b \mid \beta) = \frac{1}{B(l, m-l+1)} \left[ 1 - e^{-y_l/\beta} \right]^{l-1} \left[ e^{-y_l/\beta} \right]^{m-l+1} \frac{1}{\Gamma(r) \beta^{r+1}}\, s_b^{r-1} e^{-s_b/\beta}. \tag{37.15} \]

Making the transformation $w_l = y_l/s_b$, $s_b = s_b$, and integrating out $s_b$, we find the density of $W_l$ as the beta density

\[ f(w_l) = \frac{r}{B(l, m-l+1)} \sum_{j=0}^{l-1} \binom{l-1}{j} (-1)^j \frac{1}{[(m-l+1+j) w_l + 1]^{r+1}}, \quad 0 < w_l < \infty. \tag{37.16} \]

This ends the proof. □
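For completeness, the step from the density (37.16) to the probability statement (37.9) is a termwise integration over $(w_h, \infty)$. The short derivation below is a reader's sketch of that step and is not part of the original proof:

```latex
\Pr\{W_l > w_h\}
  = \int_{w_h}^{\infty} f(w_l)\, dw_l
  = \frac{r}{B(l, m-l+1)} \sum_{j=0}^{l-1} \binom{l-1}{j} (-1)^j
    \int_{w_h}^{\infty} \frac{dw_l}{[(m-l+1+j) w_l + 1]^{r+1}}
  = \frac{1}{B(l, m-l+1)} \sum_{j=0}^{l-1} \binom{l-1}{j} (-1)^j
    \frac{1}{(m-l+1+j)\,[1 + w_h (m-l+1+j)]^{r}} .
```

Each integral contributes a factor $1/[r\,(m-l+1+j)]$, so the leading r cancels and the right-hand side of (37.9) is recovered.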

Corollary 2.1 A lower one-sided conditional $(1-\alpha)$ prediction limit h on the minimum $Y_1$ of a set of m future ordered observations $Y_1 \le \cdots \le Y_m$ is given by

\[ \Pr\{Y_1 > h \mid S_b = s_b\} = \Pr\left\{ \frac{Y_1}{S_b} > \frac{h}{s_b} \,\Big|\, S_b = s_b \right\} = \Pr\{W_1 > w_h\} = \frac{1}{(1 + m w_h)^{r}} = 1 - \alpha. \tag{37.17} \]
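In the exponential case the prediction limit is available in closed form. The following sketch, with hypothetical function names and input values, evaluates the right-hand side of (37.9) for a general l and solves (37.17) for the limit on the minimum, giving $h = s_b\,[(1-\alpha)^{-1/r} - 1]/m$.

```python
from math import comb

def exp_pred_survival(w, l, m, r):
    """Right-hand side of (37.9): Pr{W_l > w} in the exponential case."""
    inv_beta_fn = l * comb(m, l)           # equals 1 / B(l, m - l + 1)
    total = 0.0
    for j in range(l):
        c = m - l + 1 + j
        total += comb(l - 1, j) * (-1) ** j / (c * (1.0 + w * c) ** r)
    return inv_beta_fn * total

def exp_lower_limit_min(s_b, m, r, alpha):
    """Solve (37.17) for h: lower (1 - alpha) prediction limit on the minimum Y_1."""
    w_h = ((1.0 - alpha) ** (-1.0 / r) - 1.0) / m
    return w_h * s_b

# Hypothetical inputs: r = 5 observed failures, s_b = 34.75, m = 10 future units.
h = exp_lower_limit_min(s_b=34.75, m=10, r=5, alpha=0.05)
print(h, exp_pred_survival(h / 34.75, l=1, m=10, r=5))
```

Substituting the computed limit back into `exp_pred_survival` with l = 1 should return the nominal level $1-\alpha$, which gives a simple self-check of both functions.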

Example Consider the data of fatigue tests on a particular type of structural component (stringer) of the IL-86 aircraft. The data are for a complete sample of size r = n = 5, with observations of time to crack initiation (in units of $10^4$ flight hours): $X_1 = 5$, $X_2 = 6.25$, $X_3 = 7.5$, $X_4 = 7.9$, $X_5 = 8.1$.

Goodness-of-fit testing. It is assumed that $X_i$, i = 1(1)5, follow the two-parameter Weibull distribution (37.1), where the parameters $\beta$ and $\delta$ are unknown. We assess the statistical significance of departures from the Weibull model by performing an empirical distribution function goodness-of-fit test. We use the S statistic (Kapur and Lamberson [6]). For censored (or complete) datasets, the S statistic is given by

\[ S = \frac{\displaystyle \sum_{i=[r/2]+1}^{r-1} \frac{\ln(x_{i+1}/x_i)}{M_i}}{\displaystyle \sum_{i=1}^{r-1} \frac{\ln(x_{i+1}/x_i)}{M_i}} = \frac{\displaystyle \sum_{i=3}^{4} \frac{\ln(x_{i+1}/x_i)}{M_i}}{\displaystyle \sum_{i=1}^{4} \frac{\ln(x_{i+1}/x_i)}{M_i}} = 0.184, \tag{37.18} \]

where [r/2] is the largest integer $\le r/2$ and the values of $M_i$ are given in Table 13 (Kapur and Lamberson [6]). The rejection region for the $\alpha$ level of significance is $\{S > S_{n;\alpha}\}$. The percentage points for $S_{n;\alpha}$ were given by Kapur and Lamberson [6]. For this example,

\[ S = 0.184 < S_{n=5;\,\alpha=0.05} = 0.86. \tag{37.19} \]

Thus, there is no evidence to rule out the Weibull model. The maximum likelihood estimates of the unknown parameters $\beta$ and $\delta$ are $\hat{\beta} = 7.42603$ and $\hat{\delta} = 7.9081$, respectively.

Warranty period estimation. It follows from (37.7) that

\[ \Pr\{Y_1 > h \mid \mathbf{z}\} = \Pr\{\hat{\delta} \ln(Y_1/\hat{\beta}) > \hat{\delta} \ln(h/\hat{\beta}) \mid \mathbf{z}\} = \Pr\{W_1 > w_h \mid \mathbf{z}\} = \Pr\{W_1 > -8.4378 \mid \mathbf{z}\} = \frac{0.0000141389}{0.0000148830} = 0.95 \tag{37.20} \]

and a lower 0.95 prediction limit for $Y_1$ is h = 2.5549 ($\times 10^4$) flight hours, i.e., we have obtained the time interval until the first inspection (or warranty period) equal to 25,549 flight hours with confidence level $\gamma = 1 - \alpha = 0.95$.

Inspection Policy after Warranty Period. Let us assume that in a fleet of m aircraft there are m of the same individual structure components, operating independently. Suppose an inspection is carried out at time $\tau_j$, and this shows that an initial crack (which may be detected) has not yet occurred. We now have to schedule the next inspection. Let $Y_1$ be the minimum time to crack initiation in the above components. In other words, let $Y_1$ be the smallest observation from an independent second sample of m observations from the distribution (37.1). Then the inspection times can be calculated (from (37.23) using (37.22)) as

\[ \tau_j = \hat{\beta} \exp(w_{\tau_j}/\hat{\delta}), \quad j \ge 1, \tag{37.21} \]

where it is assumed that $\tau_0 = 0$, $\tau_1$ is the time until the first inspection (or warranty period), and $w_{\tau_j}$ is determined from

\[
\begin{aligned}
\Pr\{Y_1 > \tau_j \mid Y_1 > \tau_{j-1};\, \mathbf{z}\}
&= \Pr\{\hat{\delta} \ln(Y_1/\hat{\beta}) > \hat{\delta} \ln(\tau_j/\hat{\beta}) \mid \hat{\delta} \ln(Y_1/\hat{\beta}) > \hat{\delta} \ln(\tau_{j-1}/\hat{\beta});\, \mathbf{z}\} \\
&= \Pr\{W_1 > w_{\tau_j} \mid W_1 > w_{\tau_{j-1}};\, \mathbf{z}\}
 = \frac{\Pr\{W_1 > w_{\tau_j} \mid \mathbf{z}\}}{\Pr\{W_1 > w_{\tau_{j-1}} \mid \mathbf{z}\}} = 1 - \alpha,
\end{aligned}
\tag{37.22}
\]

where

\[ W_1 = \hat{\delta} \ln(Y_1/\hat{\beta}), \qquad w_{\tau_j} = \hat{\delta} \ln(\tau_j/\hat{\beta}), \tag{37.23} \]

$\hat{\beta}$ and $\hat{\delta}$ are the ML

For further volumes: http://www.springer.com/series/7818

Sio Iong Ao Len Gelman •

Editors

Electrical Engineering and Applied Computing

123

Editors Sio Iong Ao International Association of Engineers Unit 1, 1/F, 37-39 Hung To Road Kwun Tong Hong Kong e-mail: [email protected]

Len Gelman Applied Mathematics and Computing School of Engineering Cranfield University Cranfield UK e-mail: [email protected]

ISSN 1876-1100

e-ISSN 1876-1119

ISBN 978-94-007-1191-4

e-ISBN 978-94-007-1192-1

DOI 10.1007/978-94-007-1192-1 Springer Dordrecht Heidelberg London New York Ó Springer Science+Business Media B.V. 2011 No part of this work may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, microfilming, recording or otherwise, without written permission from the Publisher, with the exception of any material supplied specifically for the purpose of being entered and executed on a computer system, for exclusive use by the purchaser of the work. Cover design: eStudio Calamar, Berlin/Figueres Printed on acid-free paper Springer is part of Springer Science+Business Media (www.springer.com)

Preface

A large international conference in Electrical Engineering and Applied Computing was held in London, U.K., 30 June–2 July, 2010, under the World Congress on Engineering (WCE 2010). The WCE 2010 was organized by the International Association of Engineers (IAENG); the Congress details are available at: http://www.iaeng.org/WCE2010. IAENG is a non-profit international association for engineers and computer scientists, which was founded originally in 1968. The World Congress on Engineering serves as good platforms for the engineering community to meet with each other and exchange ideas. The conferences have also struck a balance between theoretical and application development. The conference committees have been formed with over two hundred members who are mainly research center heads, faculty deans, department heads, professors, and research scientists from over 30 countries. The conferences are truly international meetings with a high level of participation from many countries. The response to the Congress has been excellent. There have been more than one thousand manuscript submissions for the WCE 2010. All submitted papers have gone through the peer review process, and the overall acceptance rate is 57%. This volume contains fifty-five revised and extended research articles written by prominent researchers participating in the conference. Topics covered include Control Engineering, Network Management, Wireless Networks, Biotechnology, Signal Processing, Computational Intelligence, Computational Statistics, Internet Computing, High Performance Computing, and industrial applications. The book offers the state of the art of tremendous advances in electrical engineering and applied computing and also serves as an excellent reference work for researchers and graduate students working on electrical engineering and applied computing. Sio Iong Ao Len Gelman


Contents

1 Mathematical Modelling for Coal Fired Supercritical Power Plants and Model Parameter Identification Using Genetic Algorithms
Omar Mohamed, Jihong Wang, Shen Guo, Jianlin Wei, Bushra Al-Duri, Junfu Lv and Qirui Gao

2 Sequential State Computation Using Discrete Modeling
Dumitru Topan and Lucian Mandache

3 Detection and Location of Acoustic and Electric Signals from Partial Discharges with an Adaptative Wavelet-Filter Denoising
Jesus Rubio-Serrano, Julio E. Posada and Jose A. Garcia-Souto

4 Study on a Wind Turbine in Hybrid Connection with a Energy Storage System
Hao Sun, Jihong Wang, Shen Guo and Xing Luo

5 SAR Values in a Homogenous Human Head Model
Levent Seyfi and Ercan Yaldız

6 Mitigation of Magnetic Field Under Overhead Transmission Line
Adel Zein El Dein Mohammed Moussa

7 Universal Approach of the Modified Nodal Analysis for Nonlinear Lumped Circuits in Transient Behavior
Lucian Mandache, Dumitru Topan and Ioana-Gabriela Sirbu

8 Modified 1.28 Tbit/s (32 × 4 × 10 Gbit/s) Absolute Polar Duty Cycle Division Multiplexing-WDM Transmission Over 320 km Standard Single Mode Fiber
Amin Malekmohammadi

9 Wi-Fi WEP Point-to-Point Links
J. A. R. Pacheco de Carvalho, H. Veiga, N. Marques, C. F. Ribeiro Pacheco and A. D. Reis

10 Interaction Between the Mobile Phone and Human Head of Various Sizes
Adel Zein El Dein Mohammed Moussa and Aladdein Amro

11 A Medium Range Gbps FSO Link
J. A. R. Pacheco de Carvalho, N. Marques, H. Veiga, C. F. Ribeiro Pacheco and A. D. Reis

12 A Multi-Classifier Approach for WiFi-Based Positioning System
Jikang Shin, Suk Hoon Jung, Giwan Yoon and Dongsoo Han

13 Intensity Constrained Flat Kernel Image Filtering, a Scheme for Dual Domain Local Processing
Alexander A. Gutenev

14 Convolutive Blind Separation of Speech Mixtures Using Auditory-Based Subband Model
Sid-Ahmed Selouani, Yasmina Benabderrahmane, Abderraouf Ben Salem, Habib Hamam and Douglas O’Shaughnessy

15 Time Domain Features of Heart Sounds for Determining Mechanical Valve Thrombosis
Sabri Altunkaya, Sadık Kara, Niyazi Görmüş and Saadetdin Herdem

16 On the Implementation of Dependable Real-Time Systems with Non-Preemptive EDF
Michael Short

17 Towards Linking Islands of Information Within Construction Projects Utilizing RF Technologies
Javad Majrouhi Sardroud and Mukesh Limbachiy

18 A Case Study Analysis of an E-Business Security Negotiations Support Tool
Jason R. C. Nurse and Jane E. Sinclair

19 Smart Card Web Server
Lazaros Kyrillidis, Keith Mayes and Konstantinos Markantonakis

20 A Scalable Hardware Environment for Embedded Systems Education
Tiago Gonçalves, A. Espírito-Santo, B. J. F. Ribeiro and P. D. Gaspar

21 Yield Enhancement with a Novel Method in Design of Application-Specific Networks on Chips
Atena Roshan Fekr, Majid Janidarmian, Vahhab Samadi Bokharaei and Ahmad Khademzadeh

22 On-Line Image Search Application Using Fast and Robust Color Indexing and Multi-Thread Processing
Wichian Premchaisawadi and Anucha Tungkatsathan

23 Topological Mapping Using Vision and a Sparse Distributed Memory
Mateus Mendes, A. Paulo Coimbra and Manuel M. Crisóstomo

24 A Novel Approach for Combining Genetic and Simulated Annealing Algorithms
Younis R. Elhaddad and Omar Sallabi

25 Buyer Coalition Formation with Bundle of Items by Ant Colony Optimization
Anon Sukstrienwong

26 Coevolutionary Grammatical Evolution for Building Trading Algorithms
Kamal Adamu and Steve Phelps

27 High Performance Computing Applied to the False Nearest Neighbors Method: Box-Assisted and kd-Tree Approaches
Julio J. Águila, Ismael Marín, Enrique Arias, María del Mar Artigao and Juan J. Miralles

28 Ethernet Based Implementation of a Periodic Real Time Distributed System
Sahraoui Zakaria, Labed Abdennour and Serir Aomar

29 Preliminary Analysis of Flexible Pavement Performance Data Using Linear Mixed Effects Models
Hsiang-Wei Ker and Ying-Haur Lee

30 Chi-Squared, Yule’s Q and Likelihood Ratios in Tabular Audiology Data
Muhammad Naveed Anwar, Michael P. Oakes and Ken McGarry

31 Optimising Order Splitting and Execution with Fuzzy Logic Momentum Analysis
Abdalla Kablan and Wing Lon Ng

32 The Determination of a Dynamic Cut-Off Grade for the Mining Industry
P. V. Johnson, G. W. Evatt, P. W. Duck and S. D. Howell

33 Improved Prediction of Financial Market Cycles with Artificial Neural Network and Markov Regime Switching
David Liu and Lei Zhang

34 Fund of Hedge Funds Portfolio Optimisation Using a Global Optimisation Algorithm
Bernard Minsky, M. Obradovic, Q. Tang and Rishi Thapar

35 Increasing the Sensitivity of Variability EWMA Control Charts
Saddam Akber Abbasi and Arden Miller

36 Assessing Response’s Bias, Quality of Predictions, and Robustness in Multiresponse Problems
Nuno Costa, Zulema Lopes Pereira and Martín Tanco

37 Inspection Policies in Service of Fatigued Aircraft Structures
Nicholas A. Nechval, Konstantin N. Nechval and Maris Purgailis

38 Toxicokinetic Analysis of Asymptomatic Hazard Profile of Welding Fumes and Gases
Joseph I. Achebo and Oviemuno Oghoore

39 Classification and Measurement of Efficiency and Congestion of Supply Chains
Mithun J. Sharma and Song Jin Yu

40 Comparison of Dry and Flood Turning in Terms of Dimensional Accuracy and Surface Finish of Turned Parts
Noor Hakim Rafai and Mohammad Nazrul Islam

41 Coordinated Control Methods of Waste Water Treatment Process
Magdi S. Mahmoud

42 Identical Parallel-Machine Scheduling and Worker Assignment Problem Using Genetic Algorithms to Minimize Makespan
Imran Ali Chaudhry and Sultan Mahmood

43 Dimensional Accuracy Achievable in Wire-Cut Electrical Discharge Machining
Mohammad Nazrul Islam, Noor Hakim Rafai and Sarmilan Santhosam Subramanian

44 Nash Game-Theoretic Model for Optimizing Pricing and Inventory Policies in a Three-Level Supply Chain
Yun Huang and George Q. Huang

45 Operating Schedule: Take into Account Unexpected Events in Case of a Disaster
Issam Nouaouri, Jean Christophe Nicolas and Daniel Jolly

46 Dynamic Hoist Scheduling Problem on Real-Life Electroplating Production Line
Krzysztof Kujawski and Jerzy Świątek

47 Effect of HAART on CTL Mediated Immune Cells: An Optimal Control Theoretic Approach
Priti Kumar Roy and Amar Nath Chatterjee

48 Design, Development and Validation of a Novel Mechanical Occlusion Device for Transcervical Sterilization
Muhammad Rehan, James Eugene Coleman and Abdul Ghani Olabi

49 Investigation of Cell Adhesion, Contraction and Physical Restructuring on Shear Sensitive Liquid Crystals
Chin Fhong Soon, Mansour Youseffi, Nick Blagden and Morgan Denyer

50 On the Current Densities for the Electrical Impedance Equation
Marco Pedro Ramirez Tachiquin, Jose de Jesus Gutierrez Cortes, Victor Daniel Sanchez Nava and Edgar Bernal Flores

51 Modelling of Diseased Tissue Diffuse Reflectance and Extraction of Optical Properties
Shanthi Prince and S. Malarvizhi

52 Vertical Incidence Increases Virulence in Pathogens: A Model Based Study
Priti Kumar Roy, Jayanta Mondal and Samrat Chatterjee

53 Chaotic Oscillations in Hodgkin–Huxley Neural Dynamics
Mayur Sarangdhar and Chandrasekhar Kambhampati

54 Quantification of Similarity Using Amplitudes and Firing Times of a Hodgkin–Huxley Neural Response
Mayur Sarangdhar and Chandrasekhar Kambhampati

55 Reduction of HIV Infection that Includes a Delay with Cure Rate During Long Term Treatment: A Mathematical Study
Priti Kumar Roy and Amar Nath Chatterjee

Chapter 1

Mathematical Modelling for Coal Fired Supercritical Power Plants and Model Parameter Identification Using Genetic Algorithms

Omar Mohamed, Jihong Wang, Shen Guo, Jianlin Wei, Bushra Al-Duri, Junfu Lv and Qirui Gao

Abstract This paper presents the progress of our study of a whole-process mathematical model for a supercritical coal-fired power plant. The modelling procedure is rooted in thermodynamic and engineering principles, with reference to previously published literature. The unknown model parameters are identified using Genetic Algorithms (GAs) with on-site measurement data from a 600 MW supercritical power plant. The identified parameters are verified with different sets of measured plant data. Although some assumptions are made in the modelling process to simplify the model structure to a certain level, the supercritical coal-fired power plant model reported in the paper can represent the main features of the real plant's once-through unit operation, and the simulation results show that the main variation trends of the process agree well with the measured dynamic responses from the power plant.

O. Mohamed (corresponding author), J. Wang, S. Guo, J. Wei
School of Electrical, Electronics, and Computer Engineering, University of Birmingham, Edgbaston, Birmingham, B15 2TT, UK

B. Al-Duri
School of Chemical Engineering, University of Birmingham, Edgbaston, Birmingham, B15 2TT, UK

J. Lv, Q. Gao
Department of Thermal Engineering, Tsinghua University, Beijing, People’s Republic of China

S. I. Ao and L. Gelman (eds.), Electrical Engineering and Applied Computing, Lecture Notes in Electrical Engineering, 90, DOI: 10.1007/978-94-007-1192-1_1, © Springer Science+Business Media B.V. 2011

Nomenclature

ff          Fitness function for genetic algorithms
ffr         Pulverized fuel flow rate (kg/s)
h           Enthalpy per unit mass (MJ/kg)
K           Constant parameter
k           Mass flow rate gain
m           Mass (kg)
$\dot{m}$   Mass flow rate (kg/s)
P           Pressure of a heat exchanger (MPa)
$\dot{Q}$   Heat transfer rate (MJ/s)
R           Response
T           Temperature (°C)
t           Time (s)
$\tau$      Time constant (s)
U           Internal energy (MJ)
V           Volume of fluid (m^3)
$\dot{W}$   Work rate or power (MW)
x           Generator reactance (p.u.)
y           Output vector
$\rho$      Density (kg/m^3)
$\chi$      Valve opening
$\delta$    Rotor angle (rad)
$\theta$    Mechanical angle (rad)
$\omega$    Speed (p.u.)
$\Gamma$    Torque (p.u.)

Subscripts

a     Accelerating
air   Air
e     Electrical
d     Direct axis
ec    Economizer
hp    High pressure turbine
hx    Heat exchanger
i     Inlet
ip    Intermediate pressure turbine
me    Mechanical
ms    Main steam
m     Measured
o     Outlet
out   Output of the turbine
q     Quadrature axis
rh    Reheater
sh    Superheater
si    Simulated
ww    Waterwall

Abbreviations

BMCR   Boiler maximum continuous rate
ECON   Economizer
GA     Genetic algorithm
HP     High pressure
HX     Heat exchanger
IP     Intermediate pressure
MS     Main steam
RH     Reheater
SC     Supercritical
SH     Superheater
WW     Waterwall

1.1 Introduction

The world is now facing the challenges of global warming and environmental protection. At the same time, the demand for electricity is growing rapidly due to economic growth and increases in population, especially in developing countries such as China and India. From the standpoint of the environment and sustainable energy development, renewable sources such as wind, solar, and tidal power should, in theory, be the only resources exploited. However, the growth in demand weighs heavily in the energy equation, so renewable energy alone is unlikely to generate sufficient electricity to fill the gap in the near future. Power generation using fossil fuels is therefore inevitable; in particular, coal-fired power generation is an unavoidable choice because of its huge capacity and its flexibility in load following. It is well known that conventional coal-fired power plants have a large environmental impact and lower energy conversion efficiencies. Any new coal-fired power plant must be cleaner, with more advanced and improved technologies. Apart from Carbon Capture and Storage, supercritical power plants might be the most suitable choice when environmental improvement, higher energy efficiency, and economic growth are all considered. However, there is an open issue regarding their dynamic response and performance relative to conventional subcritical plants, owing to the difference in process structure and the absence of an energy storage drum [1]. The characteristics of supercritical plants therefore require considerable attention and investigation. Supercritical boilers have to be once-through boilers because there is no distinction between the water and steam phases in the supercritical process, so there is no drum to separate a water-steam mixture. Owing to the absence of the drum, once-through boilers have less stored energy and faster responses than drum-type boilers. Supercritical power plants have several advantages over traditional subcritical plants [2, 3], including:

• Higher cycle efficiency (up to 46%) and lower fuel consumption.
• Reduced CO2 emissions per unit of power generated.
• Full integrability with CO2 capture technology.
• Fast load-demand following (for relatively small load demand changes).

However, some concerns have also been raised about whether their dynamic response is fast enough to meet demand. These concerns stem mainly from the once-through structure: there is no drum to store energy as a buffer against rapid changes in load demand. This paper develops a mathematical model of the whole plant process in order to study its dynamic responses and to answer the questions about dynamic response speed. The literature survey shows that several models have been reported, with emphasis on different aspects of the boiler characteristics. The study of the dynamic response and control system of once-through supercritical (SC) units can be traced back to 1958, when work was done on a time-based simulation of the Eddystone I unit of the Philadelphia Electric Company; the work was later extended to the simulation of the Bull Run SC generation unit in 1966 [4]. Yutaka Suzuki et al. modelled a once-through SC boiler in order to improve the control system of an existing supercritical oil-fired plant. The model was based on nonlinear partial differential equations and was validated through simulation studies [5]. Wataro Shinohara et al. presented a simplified state space model of an SC once-through boiler-turbine system and designed a nonlinear controller [6]. A pressure node model description was introduced by Toshio Inoue et al. for power system frequency simulation studies [7]. Intelligent techniques have also performed well in modelling. Neural networks have been introduced to model the SC power plant, with sufficiently accurate results if they are trained with suitable data provided by the operating unit [8]. However, neural network performance is unsatisfactory when simulating some emergency conditions of the plant, because the NN method depends entirely on the data used for the learning process, not on physical laws.

Simulation of SC boilers may be achieved either theoretically, based on physical laws, or empirically, based on experimental work. In this paper, the proposed mathematical model is based on thermodynamic principles, and the model parameters are identified using data obtained from a 600 MW SC power plant [9]. The simulation results show that the model can be trusted to simulate the whole once-through mode of operation to a certain level of accuracy.


1.2 Mathematical Model of the Plant

1.2.1 Plant Description

A once-through supercritical 600 MW power plant unit is selected for the modelling study. The schematic view of the boiler is shown in Fig. 1.1. Water from the feedwater heater is heated in the economizer before passing through the waterwall and entering the superheating stages. The superheater consists of three sections: the low temperature superheater, the platen superheater, and the final stage superheater. At steady state, the main outlet steam temperature is about 571 °C and the pressure is 25.5 MPa. The boiler has two reheating sections for reheating the steam exhausted from the high pressure turbine. The inlet temperature of the reheater is 309 °C, the outlet temperature is nearly 571 °C, and the average pressure is 4.16 MPa. The reheated steam is used to drive the intermediate pressure turbine. The mechanical power is generated through multi-stage turbines to provide adequate expansion of the steam through the turbine and, consequently, high thermal efficiency of the plant.

Fig. 1.1 Schematic view of the plant

1.2.2 Assumptions Made for Modelling

Assumptions are made to simplify the process. They should be acceptable to plant engineers and sufficient to reduce the complex physical process to a simple mathematical model for research purposes. Some of these assumptions are commonly adopted for modelling supercritical or subcritical boilers [10]. In the work reported in this paper, the following general assumptions are made:

• Fluid properties are uniform at any cross section, and the fluid flow in the boiler tubes is one-phase flow.
• In the heat exchanger, the pipes for each heat exchanger are lumped together to form one pipe.
• Only one control volume is considered in the waterwall.
• The dynamic behaviour of the air and gas pressure is neglected.

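1.2.3 The Boiler Model

1.2.3.1 Heat Exchanger Model

The various heat exchangers in the boiler are modelled using the principles of mass and energy balance. The sub-cooled water in the economizer is converted directly to supercritical steam in the waterwall without passing through an evaporation stage. The equations are rewritten in terms of the derivatives (variation rates) of the pressure and temperature of the heat exchanger. The mass balance equation of the heat exchanger (control volume) is:

\[ \frac{dm}{dt} = \dot{m}_i - \dot{m}_o \tag{1.1} \]

For a constant effective volume, Eq. 1.1 becomes:

\[ V \frac{d\rho}{dt} = \dot{m}_i - \dot{m}_o \]

The density is a differentiable function of two variables, which can be taken as the temperature and pressure inside the control volume; thus we have:

\[ V \left( \left.\frac{\partial \rho}{\partial P}\right|_T \frac{dP}{dt} + \left.\frac{\partial \rho}{\partial T}\right|_P \frac{dT}{dt} \right) = \dot{m}_i - \dot{m}_o \]

The energy balance equation is:

\[ \frac{dU_{hx}}{dt} = \dot{Q}_{hx} + \dot{m}_i h_i - \dot{m}_o h_o \]

Also, since $U_{hx} = \rho V h - P V$,

\[ \frac{dU_{hx}}{dt} = V \left[ h \left( \left.\frac{\partial \rho}{\partial P}\right|_T \frac{dP}{dt} + \left.\frac{\partial \rho}{\partial T}\right|_P \frac{dT}{dt} \right) + \rho \left( \left.\frac{\partial h}{\partial P}\right|_T \frac{dP}{dt} + \left.\frac{\partial h}{\partial T}\right|_P \frac{dT}{dt} \right) \right] - V \frac{dP}{dt} \]

so that

\[ V \left[ h \left( \left.\frac{\partial \rho}{\partial P}\right|_T \frac{dP}{dt} + \left.\frac{\partial \rho}{\partial T}\right|_P \frac{dT}{dt} \right) + \rho \left( \left.\frac{\partial h}{\partial P}\right|_T \frac{dP}{dt} + \left.\frac{\partial h}{\partial T}\right|_P \frac{dT}{dt} \right) \right] - V \frac{dP}{dt} = \dot{Q}_{hx} + \dot{m}_i h_i - \dot{m}_o h_o \tag{1.2} \]

Combining (1.1) and (1.2) gives the pressure and temperature state derivatives:

\[ \dot{P} = \frac{\dot{Q}_{hx} + \dot{m}_i H_i - \dot{m}_o H_o}{\tau} \tag{1.3} \]

\[ \dot{T} = C (\dot{m}_i - \dot{m}_o) - D \dot{P} \tag{1.4} \]

where:

\[ H_i = h_i - h - \rho \, \frac{\left.\frac{\partial h}{\partial T}\right|_P}{\left.\frac{\partial \rho}{\partial T}\right|_P} \tag{1.5} \]

\[ H_o = h_o - h - \rho \, \frac{\left.\frac{\partial h}{\partial T}\right|_P}{\left.\frac{\partial \rho}{\partial T}\right|_P} \tag{1.6} \]

\[ \tau = V \left( \rho \left.\frac{\partial h}{\partial P}\right|_T - \rho \, \frac{\left.\frac{\partial h}{\partial T}\right|_P \left.\frac{\partial \rho}{\partial P}\right|_T}{\left.\frac{\partial \rho}{\partial T}\right|_P} - 1 \right) \tag{1.7} \]

\[ C = \frac{1}{V \left.\frac{\partial \rho}{\partial T}\right|_P} \tag{1.8} \]

\[ D = \frac{\left.\frac{\partial \rho}{\partial P}\right|_T}{\left.\frac{\partial \rho}{\partial T}\right|_P} \tag{1.9} \]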
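As a minimal sketch of how the state derivatives of Eqs. 1.3 and 1.4 might be evaluated numerically, the Python fragment below treats the lumped parameters of Eqs. 1.5 to 1.9 as given constants. The function names and the numerical values in the usage example are hypothetical and chosen only for illustration; they are not plant data, and the explicit Euler step is an assumption rather than the integration scheme used by the authors.

```python
def hx_state_derivatives(q_hx, m_in, m_out, H_i, H_o, tau, C, D):
    """Return (dP/dt, dT/dt) of one lumped heat exchanger.

    Implements the pressure derivative of Eq. 1.3 and the temperature
    derivative of Eq. 1.4, with H_i, H_o, tau, C and D treated as
    constant (identified) parameters.
    """
    p_dot = (q_hx + m_in * H_i - m_out * H_o) / tau   # Eq. 1.3
    t_dot = C * (m_in - m_out) - D * p_dot            # Eq. 1.4
    return p_dot, t_dot


def euler_step(P, T, dt, **inputs):
    """Advance pressure and temperature by one explicit Euler step
    (an illustrative choice of integrator, not taken from the paper)."""
    p_dot, t_dot = hx_state_derivatives(**inputs)
    return P + dt * p_dot, T + dt * t_dot


if __name__ == "__main__":
    # Hypothetical inputs, used only to exercise the functions.
    inputs = dict(q_hx=10.0, m_in=2.0, m_out=1.0,
                  H_i=3.0, H_o=4.0, tau=4.0, C=0.5, D=2.0)
    print(hx_state_derivatives(**inputs))
    print(euler_step(25.0, 570.0, 0.1, **inputs))
```

Keeping the parameters as plain arguments mirrors the identification setting described later in the chapter, where the same quantities are adjusted until the simulated response matches the measurements.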

The temperature of the superheater is controlled by the attemperator. Therefore, the input mass flow rate to the superheater is the sum of the SC steam flow and the water spray from the attemperator. The amount of attemperator water spray is regulated by opening the spray valve, which responds to a signal from the PI controller. This prevents large temperature fluctuations and ensures maximum efficiency over a wide range of operation.

1.2.3.2 Fluid Flow

The fluid flow in boiler tubes for one-phase flow is:

\[ \dot{m} = k \sqrt{\Delta P} \tag{1.10} \]

Equation 1.10 is the simplest mathematical expression for fluid flow in boiler tubes. The flows out of the reheater and of the main steam outlet are, respectively:

\[ \dot{m}_{rh} = K'_1 \frac{P_{rh}}{\sqrt{T_{rh}}} \chi_{rh} \tag{1.11} \]

\[ \dot{m}_{ms} = K'_2 \frac{P_{ms}}{\sqrt{T_{ms}}} \chi_{ms} \tag{1.12} \]

The detailed derivation of (1.11) and (1.12) can be found in [11].
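The flow relations of Eqs. 1.10 to 1.12 can be sketched as two small Python helpers. The function names and the sample numbers are hypothetical; the only assumption about units is that the pressure drop and the temperature passed in are positive, so that the square roots are defined.

```python
import math


def tube_flow(k, delta_p):
    """Mass flow rate through a boiler tube, Eq. 1.10: m_dot = k * sqrt(dP)."""
    return k * math.sqrt(delta_p)


def valve_flow(gain, pressure, temperature, opening):
    """Steam flow through a valve, Eqs. 1.11 and 1.12:
    m_dot = K' * P / sqrt(T) * valve_opening."""
    return gain * pressure / math.sqrt(temperature) * opening


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(tube_flow(2.0, 9.0))
    print(valve_flow(3.0, 25.0, 625.0, 0.5))
```

The same `valve_flow` helper covers both the reheater and the main steam outlet, since the two equations differ only in the gain and in the variables supplied.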


1.2.4 Turbine/Generator Model

1.2.4.1 Turbine Model

The turbine is modelled through energy balance equations and then combined with the boiler model. The work rates of the high pressure and intermediate pressure turbines are:

\[ \dot{W}_{hp} = \dot{m}_{ms} (h_{ms} - h_{out}) \tag{1.13} \]

\[ \dot{W}_{ip} = \dot{m}_{rh} (h_{rh} - h_{out}) \tag{1.14} \]

The mechanical power of the plant is:

\[ P_{me} = \dot{W}_{hp} + \dot{W}_{ip} \tag{1.15} \]

Up to Eq. 1.14, the boiler-turbine unit is modelled as a set of combined equations and can be used for simulation if we assume that the generator responds instantaneously. However, the dynamics of the turbine speeds and torques are affected by the generator dynamics, and injecting only the mechanical power into the generator model does not capture this interaction between the variables. To obtain a strong coupling between the variables in the turbine-generator model, torque equilibrium equations are added to the turbine model:

\[ \dot{\omega}_{hp} = \frac{1}{M_{hp}} \left[ \Gamma_{hp} - D_{hp} \omega_{hp} - K_{HI} (\theta_{hp} - \theta_{ip}) \right] \tag{1.16} \]

\[ \dot{\theta}_{hp} = \omega_b (\omega_{hp} - 1) = (\omega_{hp} - 1) \tag{1.17} \]

\[ \dot{\omega}_{ip} = \frac{1}{M_{ip}} \left[ \Gamma_{ip} - D_{ip} \omega_{ip} + K_{HI} (\theta_{hp} - \theta_{ip}) - K_{IG} (\theta_{hp} - \theta_g) \right] \tag{1.18} \]

\[ \dot{\theta}_{ip} = \omega_b (\omega_{ip} - 1) = (\omega_{ip} - 1) \tag{1.19} \]

Note that, for a two-pole machine, $\theta_g = \delta$.

1.2.4.2 Generator Model

Generator models are reported widely in the literature; a third-order nonlinear model is adopted in our work [12]:

\[ \dot{\delta} = \Delta\omega \tag{1.20} \]

\[ J \Delta\dot{\omega} = \Gamma_a = \Gamma_m - \Gamma_e - D \Delta\omega \tag{1.21} \]

\[ \dot{e}'_q = \frac{1}{T'_{do}} \left[ E_{FD} - e'_q - (x_d - x'_d) i_d \right] \tag{1.22} \]

\[ \Gamma_e \,(\text{p.u.}) \approx P_e \,(\text{p.u.}) = \frac{V}{x'_d} e'_q \sin\delta + \frac{V^2}{2} \left( \frac{1}{x_q} - \frac{1}{x'_d} \right) \sin 2\delta \tag{1.23} \]
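To make the turbine and generator relations more concrete, the sketch below evaluates the stage work rate of Eqs. 1.13 and 1.14, the flux-linkage derivative of Eq. 1.22 and the electrical power of Eq. 1.23. The function and argument names are illustrative assumptions, and the numbers in the usage example are hypothetical per-unit values, not identified plant parameters.

```python
import math


def stage_power(m_dot, h_in, h_out):
    """Work rate of one turbine stage, Eqs. 1.13 and 1.14
    (MW when m_dot is in kg/s and enthalpy in MJ/kg)."""
    return m_dot * (h_in - h_out)


def eq_prime_derivative(E_fd, e_q, i_d, x_d, x_d_prime, T_do_prime):
    """Time derivative of the transient EMF e'_q, Eq. 1.22."""
    return (E_fd - e_q - (x_d - x_d_prime) * i_d) / T_do_prime


def electrical_power(V, e_q, delta, x_d_prime, x_q):
    """Per-unit electrical power (approximately the electrical torque), Eq. 1.23."""
    excitation_term = (V * e_q / x_d_prime) * math.sin(delta)
    saliency_term = 0.5 * V ** 2 * (1.0 / x_q - 1.0 / x_d_prime) * math.sin(2.0 * delta)
    return excitation_term + saliency_term


if __name__ == "__main__":
    # Mechanical power as the sum of the two stages, Eq. 1.15 (hypothetical inputs).
    p_me = stage_power(400.0, 3.5, 3.0) + stage_power(350.0, 3.5, 3.0)
    print(p_me)
    print(electrical_power(1.0, 1.2, math.pi / 6, 0.3, 1.7))
```

Separating the excitation and saliency terms makes it easy to see that the second term vanishes at a rotor angle of 90 degrees, which gives a quick sanity check on an implementation.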

1.3 Model Parameter Identification

1.3.1 Identification Procedures

The model parameters defined by formulae (1.3) to (1.7), together with the mass flow rate gains, heat transfer constants, and turbine and generator parameters, are all identified by Genetic Algorithms in a sequential manner. Although some of these parameters are inherently not constant, they are fitted directly to the actual plant response to save time and effort. Various data sets of boiler responses were chosen for identification and verification. First, the parameters of the pressure derivative equations are identified. Then, the identification is extended to include the temperature equations, the turbine model parameters, and finally the generator model parameters. The measured responses chosen for identification and verification are:

• Reheater pressure.
• Main SC steam pressure.
• Main SC steam temperature.
• Mass flow rate of SC steam from the boiler main outlet to the HP turbine.
• Mass flow rate of reheated steam from the reheater outlet to the IP turbine.
• Turbine speed.
• Infinite bus frequency.
• Generated power of the plant.

In recent years, Genetic Algorithms have been widely used for nonlinear system identification and optimization because of their advantages over conventional mathematical optimization techniques. GAs have been shown to be a robust optimization method for parameter identification of subcritical boiler models [13]. Initially, the GA produces random values for all the parameters to be identified; this set is called the initial population. It then evaluates the corresponding fitness function so that the best-coded parameters are copied into the next generation. The GA termination criterion depends on the value of the fitness function. If the termination criterion is not met, the GA continues to perform its three main operations: reproduction, crossover, and mutation. The fitness function for the proposed task is:

\[ ff = \sum_{n=1}^{N} (R_m - R_{si})^2 \tag{1.24} \]

The fitness function is the sum of the squared differences between the measured and simulated responses for each of the variables mentioned in this section, where N is the number of points in the recorded measured data. Load-up and load-down data have been used for identification. The load changes from 30% to 100% and then down to 55% to verify the derived model. The model is verified with ramp load-up data and steady-state data to cover a large range of once-through operation. The model has also been verified with a third set of data. The GA parameter settings used for identification are listed below:

Generation: 100
Population type: double vector
Creation function: uniform
Population size: 50–100
Mutation rate: 0.1
Mutation function: Gaussian
Migration direction: forward
Selection: stochastic uniform

Figure 1.2 shows some of the load-up identification results. The measured and simulated responses match very well for the generated power and reasonably well for the temperature. Some parameters of the boiler model are listed in Table 1.1, and those for the heat transfer rates are listed in Table 1.2.
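The identification loop described above can be illustrated with a small, self-contained genetic algorithm in Python. This is only a sketch under stated assumptions: the plant is replaced by a hypothetical first-order step response with two parameters, selection is done by keeping the fitter half of the population with one elite copy, and crossover is arithmetic. Only the sum-of-squares fitness of Eq. 1.24, the uniform creation, the Gaussian mutation, the mutation rate of 0.1, the population size of 50 and the 100 generations echo the settings listed in the text; none of this reproduces the authors' toolbox configuration.

```python
import math
import random


def simulate(params, times):
    """Hypothetical stand-in for the plant model: R(t) = K * (1 - exp(-t / tau))."""
    gain, tau = params
    return [gain * (1.0 - math.exp(-t / tau)) for t in times]


def fitness(params, times, measured):
    """Sum of squared differences between measured and simulated responses (Eq. 1.24)."""
    simulated = simulate(params, times)
    return sum((rm - rs) ** 2 for rm, rs in zip(measured, simulated))


def run_ga(times, measured, bounds, pop_size=50, generations=100,
           mutation_rate=0.1, seed=0):
    """Minimise the fitness with a simple real-coded GA (assumed operators)."""
    rng = random.Random(seed)
    # Uniform creation of the initial population inside the parameter bounds.
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=lambda p: fitness(p, times, measured))
        history.append(fitness(pop[0], times, measured))
        parents = pop[: pop_size // 2]      # reproduction: keep the fitter half
        children = [pop[0][:]]              # elitism: copy the best individual
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            w = rng.random()
            # Arithmetic crossover: convex combination of the two parents.
            child = [w * x + (1.0 - w) * y for x, y in zip(a, b)]
            for i, (lo, hi) in enumerate(bounds):
                if rng.random() < mutation_rate:
                    # Gaussian mutation, clipped back into the bounds.
                    child[i] += rng.gauss(0.0, 0.1 * (hi - lo))
                    child[i] = min(max(child[i], lo), hi)
            children.append(child)
        pop = children
    best = min(pop, key=lambda p: fitness(p, times, measured))
    return best, fitness(best, times, measured), history


if __name__ == "__main__":
    times = [float(t) for t in range(0, 60, 2)]
    measured = simulate([2.0, 8.0], times)   # synthetic "measured" data
    best, best_ff, history = run_ga(times, measured, [(0.1, 10.0), (0.5, 50.0)])
    print(best, best_ff)
```

Because the best individual is always carried over unchanged, the best fitness value cannot get worse from one generation to the next, which is a convenient property to check when adapting the sketch to a real plant model.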

1.3.2 Model Parameter Verification

The proposed model has been validated using a number of data sets, namely the load-down and steady-state data. Figure 1.3 shows some of the simulated verification results (load-down and steady-state simulation). The results presented show that the model response and the actual plant response agree well with each other.

Fig. 1.2 Identification results. Panels: main steam pressure (MPa), power (MW), main steam temperature (°C), and steam flow (kg/s), each plotted against time (min) and comparing the model response with the plant response.

Table 1.1 Heat exchanger parameters

HX      Hi      Ho      C          D
ECON    10.2    13.6    2.1e-6     -3.93
WW      12.2    13.3    -1.2e-6    -0.1299
SH      20.5    45.9    1e-6       -3.73
RH      19.8    22.0    -1e-6      -17.9

Table 1.2 Heat transfer rate

τ1 (s)    Kec       Kww     Ksh       Krh
9.3       5.7785    7.78    23.776    21.43

Fig. 1.3 Verification results. Panels: main steam temperature (°C), power (MW), reheater pressure (MPa), and steam flow (kg/s), each plotted against time (min) and comparing the plant response with the model response.

1.4 Concluding Remarks

A mathematical model for coal-fired power generation with a supercritical boiler has been presented in the paper. The model is based on thermodynamic laws and engineering principles. The model parameters are identified using recorded on-site operating data. The model is then verified using different data sets, and the simulation results show good agreement between the measured and simulated data. In future work, the model will be combined with a nonlinear mathematical model of the coal mill to obtain a complete process model from coal preparation to electricity generation. It is expected that the mill local control system will contribute substantially to enhancing the overall control of the plant.

Acknowledgments The authors would like to thank E.ON Engineering for their support and engineering advice. The authors also thank EPSRC (RG/G062889/1) and the ERD/AWM Birmingham Science City Energy Efficiency and Demand Reduction project for the research funding support.

References

1. Kundur P (1981) A survey of utility experiences with power plant response during partial load rejection and system disturbances. IEEE Trans Power Apparatus Syst PAS-100(5):2471–2475
2. Laubli F, Fenton FH (1971) The flexibility of the supercritical boiler as a partner in power system design and operation: part I. IEEE Trans Power Apparatus Syst PAS-90(4):1719–1724
3. Laubli F, Fenton FH (1971) The flexibility of the supercritical boiler as a partner in power system design and operation: part II. IEEE Trans Power Apparatus Syst PAS-90(4):1725–1733
4. Littman B, Chen TS (1966) Simulation of bull-run supercritical generation unit. IEEE Trans Power Apparatus Syst 85:711–722
5. Suzuki Y, Sik P, Uchida Y (1979) Simulation of once-through supercritical boiler. Simulation 33:181–193
6. Shinohara W, Kotischek DE (1995) A simplified model based supercritical power plant controller. In: Proceeding of the 35th IEEE Conference on Decision and Control, vol 4, pp 4486–4491
7. Inoue T, Taniguchi H, Ikeguchi Y (2000) A model of fossil fueled plant with once-through boiler for power system frequency simulation studies. IEEE Trans Power Syst 15(4):1322–1328
8. Lee KY, Hoe JS, Hoffman JA, Sung HK, Won HJ (2007) Neural network based modeling of large scale power plant. IEEE Power Engineering Society General Meeting No (24–28):1–8
9. Mohamed O, Wang J, Guo S, Al-Duri B, Wei J (2010) Modelling study of supercritical power plant and parameter identification using genetic algorithms. In: Proceedings of the World Congress on Engineering II, pp 973–978
10. Adams J, Clark DR, Luis JR, Spanbaur JP (1965) Mathematical modelling of once-through boiler dynamics. IEEE Trans Power Apparatus Syst 84(4):146–156
11. Salisbury JK (1950) Steam turbines & their cycles. Wiley, New York
12. Yu Y-N (1983) Electric power system dynamics. Academic Press, New York
13. Ghaffari A, Chaibakhsh A (2007) A simulated model for a once through boiler by parameter adjustment based on genetic algorithms. Simul Model Pract Theory 15:1029–1051

Chapter 2

Sequential State Computation Using Discrete Modeling

Dumitru Topan and Lucian Mandache

Abstract In this paper we present a method for computing the state vector sequentially, either over pre-established time intervals or punctually, that is, at selected time instants. Based on discrete circuit models with direct or iterative companion diagrams, the proposed method is intended for a wide range of analog dynamic circuits: linear or nonlinear circuits, with or without excess elements or magnetically coupled inductors. The feasibility, accessibility and advantages of the method are demonstrated by the enclosed example.

2.1 Introduction

The discretization of the circuit elements, followed by corresponding companion diagrams, leads to discrete circuit models associated with the analyzed analog circuits [1–3]. Using the Euler, trapezoidal or Gear approximations [4, 5], simple discretized models are generated, whose implementation leads to an auxiliary active resistive network. In this manner, the numerical computation of the desired dynamic quantities becomes easier and faster. Considering the time constants of the circuit, the discretization time step can be adjusted to reach the solution optimally, in terms of precision and computation time.

D. Topan (✉) Faculty of Electrical Engineering, University of Craiova, 13 A.I. Cuza Str., Craiova, 200585, Romania e-mail: [email protected]
L. Mandache Faculty of Electrical Engineering, University of Craiova, 107 Decebal Blv., Craiova, 200440, Romania e-mail: [email protected]

S. I. Ao and L. Gelman (eds.), Electrical Engineering and Applied Computing, Lecture Notes in Electrical Engineering, 90, DOI: 10.1007/978-94-007-1192-1_2, © Springer Science+Business Media B.V. 2011


The discrete modeling of nonlinear circuits also involves an iterative process, which requires updating the parameters of the companion diagram at each iteration and each integration time step [5, 6]. If nonzero initial conditions exist, they are usually computed through a steady state analysis performed prior to the transient analysis. The discrete modeling can be associated with the state variables approach [6, 7], as well as with the modified nodal approach [5, 8], the analysis strategy being chosen in accordance with the circuit topology, the number of energy storage circuit elements (capacitors and inductors) and the global size of the circuit.

The known computation algorithms based on discrete modeling compute the state vector, or the output vector directly, sequentially, step by step, along the whole analysis time [5, 9, 10]. In this paper, we propose a method that computes the state vector punctually, at the moments considered significant for the dynamic evolution of the circuit. The sequential computation for pre-established time subdomains is thus also possible.

2.2 Modeling Through Companion Diagrams

The time domain analysis is performed for the time interval $[t_0, t_f]$, bounded by the initial moment $t_0$ and the final moment $t_f$. It can be discretized with the constant time step $h$, chosen sufficiently small to allow using the Euler, trapezoidal or Gear numerical integration algorithms [1–5]. One can choose $t_0 = 0$ and $t_f = wh$, where $w$ is a positive integer.

The analog circuit analysis using discrete models requires replacing each circuit element with a proper model according to its constitutive equations. If the Euler approximation is used, the discretization equations and the corresponding discrete circuit models associated with the energy storage circuit elements are shown in Table 2.1, for the time interval $[nh, (n+1)h]$, $n < w$.

The tree capacitor voltages $u_C$ and the cotree inductor currents $i_L$ [7, 8] are chosen as state quantities, assembled in the state vector $\mathbf{x}$. The currents $I_C$ of the tree capacitors and the voltages $U_L$ across the cotree inductors are complementary variables, assembled in the vector $\mathbf{X}$. At the moment $t = nh$, these vectors are partitioned as

$$\mathbf{x}^n = \begin{bmatrix} \mathbf{u}_C^n \\ \mathbf{i}_L^n \end{bmatrix}, \qquad \mathbf{X}^n = \begin{bmatrix} \mathbf{I}_C^n \\ \mathbf{U}_L^n \end{bmatrix} \tag{2.1}$$

with obvious meanings of the vectors $\mathbf{u}_C^n$, $\mathbf{i}_L^n$, $\mathbf{I}_C^n$, $\mathbf{U}_L^n$.

For the magnetically coupled inductors, the discretized equations and the companion diagram are shown in Table 2.1, where the following notations were used:

$$\begin{aligned} R_{11}^{n+1} &= \frac{L_{11}}{h}, & R_{12}^{n+1} &= \frac{L_{12}}{h}, & e_1^{n+1} &= \frac{L_{11}}{h}\, i_1^n + \frac{L_{12}}{h}\, i_2^n, \\ R_{22}^{n+1} &= \frac{L_{22}}{h}, & R_{21}^{n+1} &= \frac{L_{21}}{h}, & e_2^{n+1} &= \frac{L_{22}}{h}\, i_2^n + \frac{L_{21}}{h}\, i_1^n. \end{aligned} \tag{2.2}$$

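Table 2.1 Discrete modeling of the energy storage elements (Euler approximation; element symbols and diagram drawings not reproduced)

Row | Element | Parameter | Discretized expression | Companion diagram
1 | Tree capacitor | $C = 1/S$ | $u_C^{n+1} = u_C^n + hS\, I_C^{n+1}$ | resistance $hS$ in series with the voltage source $u_C^n$
2 | Excess capacitor | $C = 1/S$ | $i_C^{n+1} = \frac{1}{hS}\left(U_C^{n+1} - U_C^n\right)$ | resistance $hS$ in parallel with the current source $\frac{1}{hS} U_C^n$
3 | Cotree inductor | $L = 1/\Gamma$ | $i_L^{n+1} = i_L^n + h\Gamma\, U_L^{n+1}$ | conductance $h\Gamma$ in parallel with the current source $i_L^n$
4 | Excess inductor | $L = 1/\Gamma$ | $u_L^{n+1} = \frac{1}{h\Gamma}\left(I_L^{n+1} - I_L^n\right)$ | resistance $\frac{1}{h\Gamma}$ in series with the voltage source $\frac{1}{h\Gamma} I_L^n$
5 | Magnetically coupled inductor pair | $L_{11}, L_{22}, L_{12}, L_{21}$ | $U_1^{n+1} = R_{11} i_1^{n+1} + R_{12} i_2^{n+1} - R_{11} i_1^n - R_{12} i_2^n$; $U_2^{n+1} = R_{21} i_1^{n+1} + R_{22} i_2^{n+1} - R_{21} i_1^n - R_{22} i_2^n$ | in each port $k = 1, 2$: resistance $R_{kk}$, controlled source $R_{12} i_2^{n+1}$ (port 1) or $R_{21} i_1^{n+1}$ (port 2), and source $e_k^{n+1}$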

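Table 2.2 Iterative discrete modeling (element symbols and diagram drawings not reproduced)

Element | Iterative dynamic parameter | Notations in the companion diagram | Companion diagram
$u = \hat{u}(i)$ | $R^{n+1,m} = \left.\frac{\partial u}{\partial i}\right|_{i = i^{n+1,m}}$ | $e^{n+1,m} = u^{n+1,m} - R^{n+1,m}\, i^{n+1,m}$ | $R^{n+1,m}$ in series with $e^{n+1,m}$
$i = \hat{i}(u)$ | $G^{n+1,m} = \left.\frac{\partial i}{\partial u}\right|_{u = u^{n+1,m}}$ | $j^{n+1,m} = i^{n+1,m} - G^{n+1,m}\, u^{n+1,m}$ | $G^{n+1,m}$ in parallel with $j^{n+1,m}$
$q = \hat{q}(u)$ | $C^{n+1,m} = \left.\frac{\partial q}{\partial u}\right|_{u = u^{n+1,m}}$ | $G^{n+1,m} = \frac{1}{h} C^{n+1,m}$; $j^{n+1,m} = i^{n+1,m} - \frac{1}{h} C^{n+1,m}\, u^{n+1,m}$ | $G^{n+1,m}$ in parallel with $j^{n+1,m}$
$u = \hat{u}(q)$ | $S^{n+1,m} = \left.\frac{\partial u}{\partial q}\right|_{q = q^{n+1,m}}$ | $R^{n+1,m} = h S^{n+1,m}$; $e^{n+1,m} = u^{n+1,m} - h S^{n+1,m}\, i^{n+1,m}$ | $R^{n+1,m}$ in series with $e^{n+1,m}$
$\varphi = \hat{\varphi}(i)$ | $L^{n+1,m} = \left.\frac{\partial \varphi}{\partial i}\right|_{i = i^{n+1,m}}$ | $R^{n+1,m} = \frac{1}{h} L^{n+1,m}$; $e^{n+1,m} = u^{n+1,m} - \frac{1}{h} L^{n+1,m}\, i^{n+1,m}$ | $R^{n+1,m}$ in series with $e^{n+1,m}$
$i = \hat{i}(\varphi)$ | $\Gamma^{n+1,m} = \left.\frac{\partial i}{\partial \varphi}\right|_{\varphi = \varphi^{n+1,m}}$ | $G^{n+1,m} = h \Gamma^{n+1,m}$; $j^{n+1,m} = i^{n+1,m} - h \Gamma^{n+1,m}\, u^{n+1,m}$ | $G^{n+1,m}$ in parallel with $j^{n+1,m}$

In every companion diagram the port quantities are the new iterates $u^{n+1,m+1}$ and $i^{n+1,m+1}$.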

For nonlinear circuits, the state variable computation at the moment $t = (n+1)h$ requires an iterative process that converges towards the exact solution [4, 5]. A second upper index corresponds to the iteration order (see Table 2.2). Results similar to those of Tables 2.1 and 2.2 can be obtained using the trapezoidal [5, 11] or Gear integration rule [4, 5].
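The iterative companion model can be illustrated on a single resistive step. The sketch below is only an assumption-laden example, not the authors' implementation: it takes a current-controlled nonlinear resistor $u = \hat{u}(i)$ in series with a source and a linear resistor, replaces it at each iteration $m$ by $R^{n+1,m}$ in series with $e^{n+1,m}$, and solves the resulting linear circuit. The element law $\hat{u}(i) = a i + b i^3$, all numerical values and the function names are hypothetical.

```python
def solve_series_circuit(v_src, r0, u_hat, du_di, i_start=0.0, tol=1e-12, max_iter=50):
    """Iterate the companion model of a current-controlled resistor u = u_hat(i).

    At iteration m the element is replaced by R_m = du/di(i_m) in series with
    e_m = u_hat(i_m) - R_m * i_m, and the linearized circuit is solved for i_{m+1}.
    """
    i_m = i_start
    for _ in range(max_iter):
        r_m = du_di(i_m)                      # iterative dynamic resistance
        e_m = u_hat(i_m) - r_m * i_m          # companion voltage source
        i_next = (v_src - e_m) / (r0 + r_m)   # Kirchhoff voltage law, linear circuit
        if abs(i_next - i_m) < tol:
            return i_next
        i_m = i_next
    raise RuntimeError("companion iteration did not converge")


# Hypothetical element law u = a*i + b*i**3 (not taken from the chapter)
A_COEF, B_COEF = 5.0, 2.0


def u_hat(i):
    return A_COEF * i + B_COEF * i ** 3


def du_di(i):
    return A_COEF + 3.0 * B_COEF * i ** 2


i_sol = solve_series_circuit(v_src=10.0, r0=10.0, u_hat=u_hat, du_di=du_di)
residual = 10.0 - 10.0 * i_sol - u_hat(i_sol)   # voltage-law residual at the solution
```

Each pass through the loop is one Newton step, which is why the parameters of the companion diagram must be refreshed at every iteration.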

2.3 Sequential and Punctual State Computation

The treatment with discretized models substitutes the circuit elements with companion diagrams, which together form a resistive model of the circuit. This model allows the sequential computation of the circuit solution.

2.3.1 Circuits Without Excess Elements

If the given circuit contains neither capacitor loops nor inductor cutsets [7, 8], the discretization expressions of the energy storage elements (Table 2.1, rows 1 and 3), written with the notations (2.1), give

$$\mathbf{x}^{n+1} = \mathbf{x}^n + h \begin{bmatrix} \mathbf{S} & \mathbf{0} \\ \mathbf{0} & \mathbf{\Gamma} \end{bmatrix} \mathbf{X}^{n+1}, \tag{2.3}$$

where $\mathbf{S}$ is the diagonal matrix of capacitor elastances and $\mathbf{\Gamma}$ is the matrix of inductor reciprocal inductances. Starting from the companion resistive diagram, the complementary variables are obtained as output quantities [5, 10, 11] of the circuit

$$\mathbf{X}^{n+1} = \mathbf{E}\, \mathbf{x}^n + \mathbf{F}\, \mathbf{u}^{n+1}, \tag{2.4}$$

where $\mathbf{E}$ and $\mathbf{F}$ are transmittance matrices, and $\mathbf{u}^{n+1}$ is the vector of input quantities [7, 8] at the moment $t = (n+1)h$. From (2.3) and (2.4) one obtains an equation that allows computing the state vector sequentially, starting from its initial value $\mathbf{x}^0 = \mathbf{x}(0)$ until the final value $\mathbf{x}^w = \mathbf{x}(wh)$:

$$\mathbf{x}^{n+1} = \mathbf{M}\, \mathbf{x}^n + \mathbf{N}\, \mathbf{u}^{n+1}, \tag{2.5}$$

where

$$\mathbf{M} = \mathbf{1} + h \begin{bmatrix} \mathbf{S} & \mathbf{0} \\ \mathbf{0} & \mathbf{\Gamma} \end{bmatrix} \mathbf{E}, \tag{2.6}$$

$\mathbf{1}$ being the identity matrix, and

$$\mathbf{N} = h \begin{bmatrix} \mathbf{S} & \mathbf{0} \\ \mathbf{0} & \mathbf{\Gamma} \end{bmatrix} \mathbf{F}. \tag{2.7}$$

Starting from Eq. 2.5, mathematical induction gives the useful formula

$$\mathbf{x}^n = \mathbf{M}^n \mathbf{x}^0 + \sum_{k=1}^{n} \mathbf{M}^{n-k}\, \mathbf{N}\, \mathbf{u}^k, \tag{2.8}$$

where the upper indexes of the matrix $\mathbf{M}$ are integer power exponents. The formula (2.8) allows the punctual computation of the state vector at any moment $t = nh$, if the initial conditions of the circuit and the excitation quantities are known.

If a particular solution $\mathbf{x}_p(t)$ of the state equation exists, it significantly simplifies the computation of the general solution $\mathbf{x}(t)$. Using the Euler numerical integration method, one obtains [5]:

$$\mathbf{x}^{n+1} = \mathbf{M} \left( \mathbf{x}^n - \mathbf{x}_p^n \right) + \mathbf{x}_p^{n+1}. \tag{2.9}$$

The sequential computation of the state vector requires the prior construction of the matrix $\mathbf{E}$, according to Eqs. 2.6 and 2.9. This requires analyzing an auxiliary circuit obtained by setting all independent sources to zero in the given circuit. Starting from Eq. 2.9, the expression

$$\mathbf{x}^n = \mathbf{M}^n \left( \mathbf{x}^0 - \mathbf{x}_p^0 \right) + \mathbf{x}_p^n \tag{2.10}$$

allows the punctual computation of the state vector.
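For the reader's convenience, the induction step behind the punctual formula (2.8) and the passage from (2.9) to (2.10) can be written out as follows. This is a worked restatement of the equations above, with the auxiliary deviation $\mathbf{d}^n = \mathbf{x}^n - \mathbf{x}_p^n$ introduced here only for the derivation.

```latex
% Induction step for (2.8): assume the formula holds at step n, then apply (2.5) once.
\begin{aligned}
\mathbf{x}^{n+1} &= \mathbf{M}\,\mathbf{x}^{n} + \mathbf{N}\,\mathbf{u}^{n+1}
  = \mathbf{M}\Big(\mathbf{M}^{n}\mathbf{x}^{0}
    + \sum_{k=1}^{n}\mathbf{M}^{n-k}\,\mathbf{N}\,\mathbf{u}^{k}\Big)
    + \mathbf{N}\,\mathbf{u}^{n+1} \\
 &= \mathbf{M}^{n+1}\mathbf{x}^{0}
    + \sum_{k=1}^{n+1}\mathbf{M}^{\,n+1-k}\,\mathbf{N}\,\mathbf{u}^{k}.
\end{aligned}

% From (2.9) to (2.10): the deviation d^n = x^n - x_p^n obeys d^{n+1} = M d^n.
\mathbf{d}^{n} = \mathbf{M}^{n}\,\mathbf{d}^{0}
\;\Longrightarrow\;
\mathbf{x}^{n} = \mathbf{M}^{n}\big(\mathbf{x}^{0}-\mathbf{x}_p^{0}\big) + \mathbf{x}_p^{n}.
```

The base case $n = 1$ of (2.8) is Eq. 2.5 itself, since $\mathbf{M}^0 = \mathbf{1}$.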

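2.3.2 Circuits with Excess Elements

The excess capacitor voltages [8, 11], assembled in the vector $\mathbf{U}_C$, as well as the excess inductor currents [5, 7, 8], assembled in the vector $\mathbf{I}_L$, can be expressed in terms of the state variables and excitation quantities, at the moment $t = nh$:

$$\begin{bmatrix} \mathbf{U}_C^n \\ \mathbf{I}_L^n \end{bmatrix} = \begin{bmatrix} \mathbf{K}_1 & \mathbf{0} \\ \mathbf{0} & \mathbf{K}_2 \end{bmatrix} \mathbf{x}^n + \begin{bmatrix} \mathbf{K}'_1 & \mathbf{0} \\ \mathbf{0} & \mathbf{K}'_2 \end{bmatrix} \mathbf{u}^n, \tag{2.11}$$

where the matrices $\mathbf{K}_1, \mathbf{K}'_1$ and $\mathbf{K}_2, \mathbf{K}'_2$ contain voltage and current ratios respectively. Using Table 2.1, the companion diagram associated with the analyzed circuit can be obtained, from which the complementary quantities are given by

$$\mathbf{X}^{n+1} = \mathbf{E}\, \mathbf{x}^n + \mathbf{E}_1 \begin{bmatrix} \mathbf{U}_C^n \\ \mathbf{I}_L^n \end{bmatrix} + \mathbf{F}\, \mathbf{u}^{n+1}, \tag{2.12}$$

the matrices $\mathbf{E}$, $\mathbf{E}_1$ and $\mathbf{F}$ containing transmittance coefficients. Considering Eqs. 2.11 and 2.12, a recurrence expression of the same kind as (2.5) is obtained, allowing the sequential computation of the state vector:

$$\mathbf{x}^{n+1} = \mathbf{M}\, \mathbf{x}^n + \mathbf{N}\, \mathbf{u}^{n+1} + \mathbf{N}_1\, \mathbf{u}^n, \tag{2.13}$$

where

$$\begin{aligned} \mathbf{M} &= \mathbf{1} + h \begin{bmatrix} \mathbf{S} & \mathbf{0} \\ \mathbf{0} & \mathbf{\Gamma} \end{bmatrix} \left( \mathbf{E} + \mathbf{E}_1 \mathbf{K} \right), \\ \mathbf{N} &= h \begin{bmatrix} \mathbf{S} & \mathbf{0} \\ \mathbf{0} & \mathbf{\Gamma} \end{bmatrix} \mathbf{F}, \qquad \mathbf{N}_1 = h \begin{bmatrix} \mathbf{S} & \mathbf{0} \\ \mathbf{0} & \mathbf{\Gamma} \end{bmatrix} \mathbf{E}_1 \mathbf{K}', \\ \mathbf{K} &= \begin{bmatrix} \mathbf{K}_1 & \mathbf{0} \\ \mathbf{0} & \mathbf{K}_2 \end{bmatrix}, \qquad \mathbf{K}' = \begin{bmatrix} \mathbf{K}'_1 & \mathbf{0} \\ \mathbf{0} & \mathbf{K}'_2 \end{bmatrix}. \end{aligned} \tag{2.14}$$

If $\mathbf{x}_p$ is a particular solution of the state equation, the following identity is obtained:

$$\mathbf{N}\, \mathbf{u}^{n+1} + \mathbf{N}_1\, \mathbf{u}^n = \mathbf{x}_p^{n+1} - \mathbf{M}\, \mathbf{x}_p^n, \tag{2.15}$$

which allows converting (2.13) into the form (2.9), as a common expression for any circuit (with or without excess elements).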
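The substitution that leads to the recurrence (2.13) and to the matrices (2.14) can be sketched as below. The derivation assumes that the source term of (2.12) is evaluated at the instant $(n+1)h$, as in (2.4), and writes $\mathbf{D} = \mathrm{diag}(\mathbf{S}, \mathbf{\Gamma})$ as a shorthand introduced only here.

```latex
% Insert (2.11) into (2.12), with the source term taken at (n+1)h as in (2.4):
\mathbf{X}^{n+1}
  = \mathbf{E}\,\mathbf{x}^{n}
  + \mathbf{E}_1\big(\mathbf{K}\,\mathbf{x}^{n} + \mathbf{K}'\,\mathbf{u}^{n}\big)
  + \mathbf{F}\,\mathbf{u}^{n+1}
  = (\mathbf{E} + \mathbf{E}_1\mathbf{K})\,\mathbf{x}^{n}
  + \mathbf{F}\,\mathbf{u}^{n+1}
  + \mathbf{E}_1\mathbf{K}'\,\mathbf{u}^{n}.

% Insert the result into (2.3), x^{n+1} = x^n + h D X^{n+1}:
\mathbf{x}^{n+1}
  = \underbrace{\big[\mathbf{1} + h\,\mathbf{D}\,(\mathbf{E} + \mathbf{E}_1\mathbf{K})\big]}_{\mathbf{M}}\,\mathbf{x}^{n}
  + \underbrace{h\,\mathbf{D}\,\mathbf{F}}_{\mathbf{N}}\,\mathbf{u}^{n+1}
  + \underbrace{h\,\mathbf{D}\,\mathbf{E}_1\mathbf{K}'}_{\mathbf{N}_1}\,\mathbf{u}^{n}.
```

Setting $\mathbf{E}_1 = \mathbf{0}$ recovers the matrices (2.6) and (2.7) of the circuit without excess elements.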

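2.4 Example

In order to exemplify the algorithm described above, let us consider the transient response of the circuit shown in Fig. 2.1, caused by turning on the switch. The circuit parameters are: $R_1 = R_2 = R_3 = 10\ \Omega$; $L = 10$ mH; $C = 100\ \mu$F; $E = 10$ V; $J = 1$ A.

Fig. 2.1 Circuit example

The time response of the capacitor voltage and inductor current will be computed for the time interval $t \in [0, 5\ \text{ms}]$. These quantities are also the state variables. The corresponding discretized Euler companion diagram is shown in Fig. 2.2.

Fig. 2.2 Discretized diagram

According to the notations used in Sect. 2.2, we have:

$$\mathbf{x} = \begin{bmatrix} u_C \\ i_L \end{bmatrix}, \qquad \mathbf{X} = \begin{bmatrix} I_C \\ U_L \end{bmatrix}, \qquad \mathbf{u} = \begin{bmatrix} E \\ J \end{bmatrix}.$$

The way of computing the matrices $\mathbf{E}$ and $\mathbf{F}$ follows from the particular form of the expression (2.4):

$$\begin{bmatrix} I_C^{n+1} \\ U_L^{n+1} \end{bmatrix} = \begin{bmatrix} e_{11} & e_{12} \\ e_{21} & e_{22} \end{bmatrix} \begin{bmatrix} u_C^n \\ i_L^n \end{bmatrix} + \begin{bmatrix} f_{11} & f_{12} \\ f_{21} & f_{22} \end{bmatrix} \begin{bmatrix} E \\ J \end{bmatrix},$$

from where:

$$\begin{aligned} e_{11} &= \left.\frac{I_C^{n+1}}{u_C^n}\right|_{i_L^n = 0,\, E = 0,\, J = 0}, & e_{12} &= \left.\frac{I_C^{n+1}}{i_L^n}\right|_{u_C^n = 0,\, E = 0,\, J = 0}, \\ e_{21} &= \left.\frac{U_L^{n+1}}{u_C^n}\right|_{i_L^n = 0,\, E = 0,\, J = 0}, & e_{22} &= \left.\frac{U_L^{n+1}}{i_L^n}\right|_{u_C^n = 0,\, E = 0,\, J = 0}. \end{aligned}$$

Using the diagram of Fig. 2.2, the elements of the matrices $\mathbf{E}$ and $\mathbf{F}$ were computed, assuming a constant time step $h = 0.1$ ms:

$$\mathbf{E} = \begin{bmatrix} -0.1729 & -0.7519 \\ 0.7519 & -9.7740 \end{bmatrix}, \qquad \mathbf{F} = \begin{bmatrix} 0.0827 & 0.8270 \\ 0.0752 & 0.7519 \end{bmatrix}.$$

The matrices $\mathbf{M}$ and $\mathbf{N}$ given by Eqs. 2.6, 2.7 are:

$$\mathbf{M} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} + 0.1 \cdot 10^{-3} \begin{bmatrix} \dfrac{1}{100 \cdot 10^{-6}} & 0 \\ 0 & \dfrac{1}{10 \cdot 10^{-3}} \end{bmatrix} \mathbf{E} = \begin{bmatrix} 0.8271 & -0.7519 \\ 0.0075 & 0.9023 \end{bmatrix},$$

$$\mathbf{N} = 0.1 \cdot 10^{-3} \begin{bmatrix} \dfrac{1}{100 \cdot 10^{-6}} & 0 \\ 0 & \dfrac{1}{10 \cdot 10^{-3}} \end{bmatrix} \mathbf{F} = \begin{bmatrix} 0.0827 & 0.8270 \\ 0.0008 & 0.0075 \end{bmatrix}.$$

Starting from the obvious initial condition

$$\mathbf{x}^0 = \begin{bmatrix} u_C^0 \\ i_L^0 \end{bmatrix} = \begin{bmatrix} 5\ \text{V} \\ 0.5\ \text{A} \end{bmatrix},$$

the solutions were computed using (2.8) and represented in Fig. 2.3 with solid line.

Fig. 2.3 Circuit response: $u_C$ [V] and $i_L$ [A] versus time [ms], for $h = 0.1$ ms, $h' = 0.5$ ms and the exact solution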
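The example can be replayed numerically. The following sketch is an illustration under stated assumptions rather than the authors' code: it builds $\mathbf{M}$ and $\mathbf{N}$ from Eqs. 2.6 and 2.7 with the four-digit entries of $\mathbf{E}$ and $\mathbf{F}$, taking for $\mathbf{E}$ the signs that reproduce the printed $\mathbf{M}$ and a response starting at 5 V and 0.5 A, and then compares the sequential recurrence (2.5) with the punctual formula (2.8) for constant sources. The helper names are hypothetical.

```python
def mat_mul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def mat_vec(a, v):
    """Product of a 2x2 matrix and a 2-vector."""
    return [a[i][0] * v[0] + a[i][1] * v[1] for i in range(2)]


def vec_add(a, b):
    return [a[0] + b[0], a[1] + b[1]]


H = 0.1e-3                    # time step h [s]
S = 1.0 / 100e-6              # capacitor elastance 1/C
GAMMA = 1.0 / 10e-3           # reciprocal inductance 1/L
E_MAT = [[-0.1729, -0.7519], [0.7519, -9.7740]]   # signs are an assumption, see lead-in
F_MAT = [[0.0827, 0.8270], [0.0752, 0.7519]]
U_IN = [10.0, 1.0]            # constant inputs E [V] and J [A]
X0 = [5.0, 0.5]               # initial state u_C [V], i_L [A]

# Eqs. 2.6 and 2.7 with the diagonal matrix diag(S, GAMMA)
M = [[1.0 + H * S * E_MAT[0][0], H * S * E_MAT[0][1]],
     [H * GAMMA * E_MAT[1][0], 1.0 + H * GAMMA * E_MAT[1][1]]]
N = [[H * S * F_MAT[0][0], H * S * F_MAT[0][1]],
     [H * GAMMA * F_MAT[1][0], H * GAMMA * F_MAT[1][1]]]


def sequential(n_steps):
    """Recurrence (2.5) applied n_steps times."""
    x = X0[:]
    for _ in range(n_steps):
        x = vec_add(mat_vec(M, x), mat_vec(N, U_IN))
    return x


def punctual(n_steps):
    """Formula (2.8) for a constant input: M^n x0 + sum_{j=0}^{n-1} M^j N u."""
    nu = mat_vec(N, U_IN)
    power = [[1.0, 0.0], [0.0, 1.0]]            # M^0
    acc = [0.0, 0.0]
    for _ in range(n_steps):
        acc = vec_add(acc, mat_vec(power, nu))  # add the term M^j N u
        power = mat_mul(power, M)               # advance to M^(j+1)
    return vec_add(mat_vec(power, X0), acc)     # power now holds M^n


x_seq = sequential(50)   # state at t = 50 h = 5 ms
x_pun = punctual(50)
```

In practice the powers of $\mathbf{M}$ needed by (2.8) can be cached or obtained by repeated squaring, which is where the punctual form saves work when only a few instants are of interest.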


The computation was repeated in the same manner for a longer time step, $h' = 5h = 0.5$ ms, the solution being shown in the same figure. Both computed solutions are compared with the exact solution, represented with thin dashed line.

2.5 Conclusion

The proposed analysis strategy and computation formulae allow not only the punctual computation of the state vector, but also crossing the integration subdomains with a variable time step. The proposed method fits naturally with any procedure based on discrete models of analog circuits, including the methods for iterative computation of nonlinear dynamic networks. The versatility of the method has already allowed an extension in connection with the modified nodal approach.

Acknowledgments This work was supported in part by the Romanian Ministry of Education, Research and Innovation under Grant PCE 539/2008.

References
1. Topan D, Mandache L (2010) Punctual state computation using discrete modeling. Lecture notes in engineering and computer science. In: Proceedings of the world congress on engineering, vol 2184, London, June 30–July 2 2010, pp 824–828
2. Henderson A (1990) Electrical networks. Edward Arnold, London, pp 319–325
3. Topan D (1978) Computerunterstützte Berechnung von Netzwerken mit zeitdiskretisierten linearisierten Modellen. Wiss. Zeitschr. T.H. Ilmenau, pp 99–107
4. Gear C (1971) The automatic integration of ordinary differential equations. ACM 14(3):314–322
5. Topan D, Mandache L (2007) Chestiuni speciale de analiza circuitelor electrice. Universitaria, Craiova, pp 115–143
6. Topan D (1995) Iterative models of nonlinear circuits. Ann Univ Craiova Electrotech 19:44–48
7. Rohrer RA (1970) Circuit theory: an introduction to the state variable approach. McGraw-Hill, New York, pp 3–4
8. Chua LO, Lin PM (1975) Computer-aided analysis of electronic circuits: algorithms and computational techniques. Prentice-Hall, Englewood Cliffs, Chaps. 8–9
9. Chen W-K (1991) Active network analysis. World Scientific, Singapore, pp 465–470
10. Opal A (1996) Sampled data simulation of linear and nonlinear circuits. IEEE Trans Computer-Aided Des Integr Circuits Syst 15(3):295–307
11. Boite R, Neirynck J (1996) Traité d'Electricité, vol IV: Théorie des Réseaux de Kirchhoff. Presses Polytechniques et Universitaires Romandes, Lausanne, pp 146–158

Chapter 3

Detection and Location of Acoustic and Electric Signals from Partial Discharges with an Adaptative Wavelet-Filter Denoising

Jesus Rubio-Serrano, Julio E. Posada and Jose A. Garcia-Souto

Abstract The objective of this research work is the design and implementation of a post-processing algorithm, or "search and localization engine", for the characterization of partial discharges (PD) and the location of their source, in order to assess the condition of paper-oil insulation systems. The PD are measured with two acoustic sensors (ultrasonic PZT) and one electric sensor (HF ferrite). The acquired signals are conditioned with an adaptive wavelet filter that is configured with only one parameter.

3.1 Introduction

Degraded insulation is a major problem in power equipment. The reliability of power plants can be improved by preventive maintenance based on the condition assessment of the electrical insulation within the equipment. The insulation degrades in service due to the accumulation of mechanical, thermal and electric stresses. Partial discharges (PD) are stochastic electric phenomena that cause a large number of small discharges (<500 pC) inside the insulation [1–3].

J. Rubio-Serrano (✉) J. E. Posada J. A. Garcia-Souto GOTL, Department of Electronic Technology, Carlos III University of Madrid, c/Butarque 15, 28911, Leganes, Madrid, Spain e-mail: [email protected]
J. E. Posada e-mail: [email protected]
J. A. Garcia-Souto e-mail: [email protected]

S. I. Ao and L. Gelman (eds.), Electrical Engineering and Applied Computing, Lecture Notes in Electrical Engineering, 90, DOI: 10.1007/978-94-007-1192-1_3, © Springer Science+Business Media B.V. 2011


PD are present in transformers due to the gas dissolved in the oil, the humidity and other faults. They become a problem when the PD activity is persistent in time or concentrated in a localized area. These are signs of an imminent failure of the power equipment. Thus, the detection, the identification [4] and the localization of PD sources are important diagnostic tools.

This paper deals with the design of the algorithm that processes the time series and performs the statistical analysis of the signals acquired on the MEDEPA test bench in order to assess the insulation faults. This set-up is an experimental PD generation and measurement system designed at the University Carlos III of Madrid in order to study and develop electrical and ultrasonic sensors [5] and analysis techniques that allow the characterization and the localization of PD.

A PD is a fast electrical transient which produces a localized acoustic emission (AE) due to the thermal expansion of the dielectric material [3]. It also generates chemical changes, light emission, etc. [6, 7]. In this work acoustic and electrical signals are processed together. The AE is characterized and both methods of detection are combined to assess the PD activity. The electro-acoustic conversion ratio of PD can be explored by these means [8].

3.2 Experimental Set-Up

The measurements are taken from the MEDEPA experimental set-up. It has the following blocks to generate different types of PD and acquire the signals (up to 100 MSps) from different sensors:

1. PD generation: the experimental set-up generates controlled PD from a high-voltage AC excitation that is reliable for the ultrasonic sensor characterization and the acoustic measurements.
2. Instrumentation for electrical measurement: the calibrated electrical measurement allows the correlation of generated PD and provides their basic characteristics (charge, instant of time, etc.).
3. Instrumentation for acoustic measurements: ultrasonic PZT detectors are used for measuring the AE outside the tank. Fiber-optic sensors are being developed for measurements inside [9].

The experimental set-up is an oil-filled tank with immersed electrodes that generate PD. The ultrasonic sensors (R15i, 150 kHz, ~1 V/Pa) are externally mounted on the tank walls. A wide-band ferrite (10 MHz) is used for electrical measurements and additional instrumentation (Techimp) provides electrical PD analysis. The AE travels through the oil (1.5 mm/µs) and the PMMA wall (2.8 mm/µs) to several ultrasonic PZT sensors.

The mechanical and acoustic set-up is represented in Fig. 3.1a. The internal PD generator consists of two cylindrical electrodes, 6 cm in diameter, that are separated by several insulating paper layers. High-voltage AC at 50 Hz is applied between 4.3 and 8.7 kV, so the PD are about 100 pC. The expected signals from a PD are shown in Fig. 3.1b: a single electric signal and an acoustic signal for each channel. The delay between the electric and acoustic signals is calculated to locate the PD source spatially.

Each single PD event produces an electric charge displacement of short duration (1 µs) that is far shorter than the detected acoustic burst. The electric pulses are detected in the generation circuit. The AE signals are detected in front of the electrodes, at the same height, on two different walls of the tank. Several sensors are used to obtain the localization of the PD source and the electro-acoustic identification of the PD.

Fig. 3.1 Experimental set-up for acoustic detection and location (a). PD single event observation: electric signal and acoustic signals from sensors 1 and 2 (b)


Fig. 3.2 PD electro-acoustic pattern: electric pattern, acoustic patterns of channels 1 and 2

3.3 Signals Characteristics

The detection of PD by electro-acoustic means faces the following difficulties: the stochastic process of PD generation and the detection limits of electric and acoustic transients (signal level, identification and matching). The signals are acquired without any external synchronization due to their stochastic generation. A threshold on the AE signals is set to ensure at least one AE detection. Afterwards, the time series are analyzed without any reference to the number of PD signals or their time-stamps.

AE signals are necessary for the spatial location of PD, but they are often fewer than the electric signals because of the strong attenuation caused by the propagation through the oil and by the obstacles in the acoustic path. Specifically, the AE detection in the experiment has the following characteristics: amplitude usually below 10 mV and signal distortion due to the acoustic propagation path from the PD source to the PZT sensor. In addition, the acoustic angle of incidence to the sensor on the wall produces internal reflection and reverberation. These effects modify the shape, the energy and the power spectrum of the received signal, so an AE from a single PD is detected differently depending on the position of the sensor.

Figure 3.2 shows the characteristic transient waves at each sensor (electric and acoustic) that are associated with a single PD event. This is the electro-acoustic pattern. Although the AE signals come from the same PD, their characteristics are different.

Electric signals are easily detected in this experiment. They are used as a zero time reference to calculate the acoustic time of flight from the PD source to the AE sensor. Thus, electro-acoustic processing is performed on the basis of pairing the signals from different sensors and sensor types.

Electric and AE signals have very different durations: 1 µs (electric) and 100 µs (AE). Multiple PD from the same or different sources can be generated within the duration of an AE signal, so the detected AE signal can be the result of the acoustic interference of several PD events. In addition, each AE signal can be associated with more than one electric signal by using time criteria.

The first approach processes the different signals independently and uses statistical analysis to link them together and identify PD events [10]. In addition, an all-acoustic system of four or more channels is being developed to locate PD events on the basis of multichannel processing.

3.4 Signal Processing

The main objective of the algorithm is to analyze the time series in order to detect and statistically evaluate the PD activity and its characteristics: PMCC and energy of PD events, energy ratio between channels and delays. The selected processing techniques meet the following requirements [10]: (a) same processing regardless of the characteristics of the signal, (b) accurate time-stamp of the detected signals, (c) detection based on the shape and the energy, (d) identification tools and (e) statistical analysis for signal pairing. The signal processing has the following structure: pattern selection, wavelet filtering, acoustic detection, electro-acoustic pairing, PD event identification and PD localization.

3.4.1 Pattern Selection

A model of PD is selected from the measurement of a single event (Fig. 3.1b). It is selected by one of these means: (a) the technician's observation of a set of signals that repeats with an expected delay, (b) the set of transients selected by amplitude criteria in each channel and (c) a previously stored PD that is useful to study the aging of the insulation. The PD pattern is the set of selected transient waves (Fig. 3.2).

3.4.2 Wavelet Filtering

Signal denoising is performed by wavelet filtering, which preserves the time and shape characteristics of the original PD signals [11, 12]. In addition, wavelet filters are self-configurable for different kinds of signals by using an automatic selection rule that extracts the main characteristics [13, 14].

The basic steps of the wavelet filtering are the following: (a) transformation of the signal into the wavelet space, (b) thresholding of the wavelet components (all coefficients smaller than a certain threshold are set to zero) and (c) reverse transformation of the non-zero components. As a result, the signal is obtained without undesired noise.

The wavelet transform is considered two-dimensional: in time and in scale or level of the wavelet. Each level is associated with particular frequency bands. After the n-level transformation the signal in the wavelet space is a sum of wavelet decompositions (D) and approximations (A):

$$\text{signal} = A_n + \sum_{i=1}^{n} D_i \tag{3.1}$$

This tool is used in combination with the Pearson product-moment correlation coefficient (PMCC or ratio) and the energy from the cross-correlation to identify which indices of $D_i$ carry the main information of the pattern. The same indices are used to configure the filter that is applied to the acquired signals.

PMCC is a statistical index that measures the linear dependence between two vectors X and Y. It is independent of the signal's energy, so it is used to compare the wave shape of two signals with the same length, even when they are out of phase. It is defined by (3.2).

$$\text{PMCC} = r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \sum_{i=1}^{n} (y_i - \bar{y})^2}} \tag{3.2}$$

The flowchart of the wavelet filtering is shown in Fig. 3.3. Note that the pattern for each channel is processed only once, which yields the reconstructed signal (PATTERNw) and the configuration of the filter. Each and every acquisition is then filtered with these parameters. The filtered patterns (PATTERNw) and the filtered signals (SIGNALw) are the sum of their respective selected decompositions. First the pattern of each channel is filtered and conditioned, and then each acquisition is individually processed.

This wavelet filter has the following advantages:
• It does not distort the waveform of the signals, so the temporal information is conserved. This is important for cross-correlation and PMCC.
• It does not delay the signal. This is important for the time of flight calculation.
• It is self-configurable. Once the threshold is set, the algorithm selects the decompositions that contain the main frequencies of the signals.
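The selection of decompositions can be sketched in a few lines. The code below is a simplified stand-in and not the authors' filter: it uses a Haar multiresolution split (block averages) instead of the Daubechies 'db20' wavelet named in Fig. 3.3, ranks the details by energy, and adds them one at a time until the PMCC between the reconstructed and the original pattern reaches a threshold. The ranking rule, the threshold value, the synthetic pattern and all function names are assumptions made for illustration.

```python
import math


def pmcc(x, y):
    """Pearson product-moment correlation coefficient, Eq. 3.2."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return num / den if den > 0.0 else 0.0


def block_average(x, width):
    """Piecewise-constant approximation: mean over consecutive blocks of `width` samples."""
    out = []
    for start in range(0, len(x), width):
        block = x[start:start + width]
        out.extend([sum(block) / len(block)] * len(block))
    return out


def haar_decompose(x, levels):
    """Return ([D1..Dn], An) such that x = An + D1 + ... + Dn sample by sample (Eq. 3.1)."""
    details, approx = [], list(x)
    for level in range(1, levels + 1):
        coarser = block_average(x, 2 ** level)
        details.append([a - c for a, c in zip(approx, coarser)])
        approx = coarser
    return details, approx


def select_details(pattern, levels, min_pmcc):
    """Add details by decreasing energy until PMCC(PATTERNw, PATTERN) >= min_pmcc."""
    details, _ = haar_decompose(pattern, levels)
    order = sorted(range(levels), key=lambda i: sum(v * v for v in details[i]), reverse=True)
    chosen, recon = [], [0.0] * len(pattern)
    for idx in order:
        chosen.append(idx)
        recon = [r + d for r, d in zip(recon, details[idx])]
        if pmcc(recon, pattern) >= min_pmcc:
            break
    return sorted(chosen), recon


def apply_filter(signal, levels, chosen):
    """Filter an acquisition with the detail indices configured from the pattern."""
    details, _ = haar_decompose(signal, levels)
    return [sum(details[i][k] for i in chosen) for k in range(len(signal))]


# Synthetic damped burst standing in for an acoustic pattern (hypothetical data)
pattern = [math.sin(2.0 * math.pi * k / 8.0) * math.exp(-k / 20.0) for k in range(64)]
chosen, pattern_w = select_details(pattern, levels=4, min_pmcc=0.9)
```

Because the filter is a sum of selected components of the same length as the input, it introduces no delay, which matches the second advantage listed above.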

3.4.3 Acoustic Detection

Each acoustic acquisition is compared with the acoustic pattern through the cross-correlation. Cross-correlation is used as a measure of the similarity between two signals. Moreover, the time location of each local peak matches the starting instant of a transient similar to the pattern. The value of the peak is also a good estimator of the similarity of the signals. Cross-correlation is used as a search engine to detect the transients that are the best candidates for coming from a PD. It is also used to associate a time-stamp with each one.

Fig. 3.3 Wavelet filtering flowchart. Configuration: Daubechies 'db 20' wavelet, n decompositions. The pattern is decomposed (PATTERN = D1 + D2 + … + Dn + An), the decompositions are ordered (D1', D2', …, Dn') and added one at a time until PATTERNw meets the parameter conditions at order m'. The resulting configuration i' = 1' … m' gives PATTERNw = ΣDi' and is applied to each decomposed signal to reconstruct SIGNALw = ΣDi'


The algorithm analyzes the peaks of the cross-correlation in order to decide if the detected transient satisfies the minimum requirements of the selected parameters (energy, amplitude, PMCC, etc.). A maximum of four transients per acoustic acquisition are stored for statistical analysis.
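The search step can be sketched as a sliding dot product followed by peak picking. This is a minimal illustration, assuming a plain (unnormalized) cross-correlation and a single amplitude threshold on the peak value; the function names, the threshold and the synthetic data are hypothetical, and the additional energy and PMCC checks mentioned above are left out.

```python
def cross_correlation(signal, pattern):
    """Sliding dot product: entry k is the correlation with the pattern starting at sample k."""
    m = len(pattern)
    return [sum(signal[k + j] * pattern[j] for j in range(m))
            for k in range(len(signal) - m + 1)]


def detect_transients(signal, pattern, min_peak, max_count=4):
    """Return up to max_count (start_index, peak_value) pairs, strongest first."""
    xc = cross_correlation(signal, pattern)
    peaks = []
    for k in range(len(xc)):
        left = xc[k - 1] if k > 0 else float("-inf")
        right = xc[k + 1] if k + 1 < len(xc) else float("-inf")
        # keep local maxima of the cross-correlation above the threshold
        if xc[k] >= min_peak and xc[k] > left and xc[k] >= right:
            peaks.append((k, xc[k]))
    peaks.sort(key=lambda p: p[1], reverse=True)
    return peaks[:max_count]


# Synthetic acquisition: the pattern at sample 10 and a half-amplitude copy at sample 30
PATTERN = [1.0, -2.0, 1.5, -0.5]
signal = [0.0] * 50
for start, gain in ((10, 1.0), (30, 0.5)):
    for j, p in enumerate(PATTERN):
        signal[start + j] += gain * p

peaks = detect_transients(signal, PATTERN, min_peak=3.0)
```

At 100 MSps each sample index corresponds to 10 ns, so the start index converts directly into the time-stamp of the transient.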

3.4.4 Electro-Acoustic Signal Association

The next step is the cross-correlation of the electric signals with the pattern. Although it is based on the same tool, some differences are introduced. In this case, the maximum absolute value of the cross-correlation is searched. Positive and negative peaks of the cross-correlation are detected and they are associated with the instantaneous phase of the power line voltage, which is an additional parameter for identification.

The electric signals are searched in a temporal window that is compatible with the detected acoustic signal (3.3). Thus, the search within the electric acquisition is delimited between the time-stamp of the acoustic signal and a time period before it. This temporal window corresponds to the time that the AE takes to cross the tank. In the experiment of Fig. 3.1 the length of the temporal window is 350 µs, considering a sound speed in oil of ~1.5 mm/µs and a tank length of 500 mm.

$$t_{start}(\text{elec\_sig}) \in \left[ t_{start}(\text{acous\_sig}) - \frac{\text{dist}_{tank}}{v_{sound}},\ t_{start}(\text{acous\_sig}) \right] \tag{3.3}$$

Each acoustic signal is matched with up to four electrical signals that satisfy Eq. 3.3, and the database of PD parameters is obtained. Afterwards, the presentation tool provides the histogram of the delay between paired signals in order to analyze the persistency of some delay values. The values with higher incidence correspond to a fault in the insulation.

This process of acoustic detection and electro-acoustic signal association is implemented separately for each acoustic channel. In this approach, the data obtained for each acoustic channel are independent of the others.
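The temporal window of Eq. 3.3 amounts to a simple filter on the electric time-stamps. The sketch below is illustrative only: the function name is hypothetical, the default window is the 350 µs value quoted in the text, and keeping the most recent electric pulses when more than four fall inside the window is an assumption, since the text does not state which ones are retained.

```python
def pair_electric(t_acoustic_us, electric_times_us, window_us=350.0, max_pairs=4):
    """Electric time-stamps inside [t_acoustic - window, t_acoustic], as in Eq. 3.3.

    Returns (t_electric, delay) pairs in microseconds, at most max_pairs,
    latest electric pulse first (assumed tie-breaking rule).
    """
    lower = t_acoustic_us - window_us
    hits = [t for t in electric_times_us if lower <= t <= t_acoustic_us]
    hits.sort(reverse=True)
    return [(t, t_acoustic_us - t) for t in hits[:max_pairs]]


# Hypothetical time-stamps in microseconds
pairs = pair_electric(1000.0, [100.0, 700.0, 898.0, 1001.0])
```

The delays returned for all acquisitions are the values that would feed the histogram used to look for persistent delay values.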

3.4.5 PD Event Association

The association of the transients detected in the acoustic and electric channels provides sets of related signals that come from single PD events with a certain probability. Hence, each PD event is defined by three signals: the electric signal that is associated with both acoustic channels and the two corresponding acoustic signals. As a result, each PD event contains an electric time-stamp, which is the zero time reference, and the time of flight of each acoustic signal. These parameters and the references of association are stored in a database as structured information that is used for the statistical analysis.

3.4.6 Localization of the PD Events

Once the database is generated, all the PD events are analyzed in order to assess the condition of the insulation. The fault inside the insulation is identified by the persistency of PD events and located acoustically. The localization is made in the plane which contains the acoustic sensors and the paper between the electrodes. PD are generated in this region. Reduced to this 2-D case, Eq. 3.4 is used as a simple localization tool.

$$\begin{aligned} (x_{PD} - x_{S1})^2 + (y_{PD} - y_{S1})^2 &= (v_{sound}\, T_{S1})^2 \\ (x_{PD} - x_{S2})^2 + (y_{PD} - y_{S2})^2 &= (v_{sound}\, T_{S2})^2 \end{aligned} \tag{3.4}$$

Here $(x_{S1}, y_{S1})$ and $(x_{S2}, y_{S2})$ are the coordinates of sensors 1 and 2, respectively, and $T_{S1}$ and $T_{S2}$ are the times of flight of the acoustic signals to sensors 1 and 2, respectively. Equation 3.4 represents the intersection of two circles whose centers are located at the positions of sensors 1 and 2.

When all the PD events are localized and represented, the cluster of PD from the same region is statistically studied in order to find the dependence between the parameters of the acoustic and electric measurements. These PD were probably generated in the same insulation fault, so their acoustic path, attenuation and other variables involved in the acoustic detection should be identical. The PD events of the same cluster are analyzed against the isolated PD events. This study delimits the range of values for a valid PD event. The persistency and the concentration of PD activity are the symptoms of the degradation of the insulation system. Hence, although isolated PD events can be valid, they are not relevant for the detection of faults inside the insulation.
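Equation 3.4 can be solved in closed form as the intersection of two circles. The sketch below is one possible way to do it, not the authors' implementation: the function name, the sensor coordinates and the source position are hypothetical, distances are in millimetres, times in microseconds, and a single sound speed of 1.5 mm/µs is assumed for the whole path (the PMMA wall is neglected).

```python
import math


def locate_pd(s1, s2, t1_us, t2_us, v_sound=1.5):
    """Intersect the two circles of Eq. 3.4; returns 0, 1 or 2 candidate points (x, y)."""
    r1, r2 = v_sound * t1_us, v_sound * t2_us
    dx, dy = s2[0] - s1[0], s2[1] - s1[1]
    d = math.hypot(dx, dy)
    if d == 0.0 or d > r1 + r2 or d < abs(r1 - r2):
        return []                                   # coincident sensors or no intersection
    a = (r1 ** 2 - r2 ** 2 + d ** 2) / (2.0 * d)    # distance from s1 to the common chord
    h = math.sqrt(max(r1 ** 2 - a ** 2, 0.0))       # half-length of the chord
    xm, ym = s1[0] + a * dx / d, s1[1] + a * dy / d
    if h == 0.0:
        return [(xm, ym)]
    return [(xm - h * dy / d, ym + h * dx / d),
            (xm + h * dy / d, ym - h * dx / d)]


# Hypothetical geometry: sensors on two different walls, source between the electrodes
sensor1, sensor2 = (0.0, 100.0), (150.0, 0.0)
source = (120.0, 90.0)
t1 = math.hypot(source[0] - sensor1[0], source[1] - sensor1[1]) / 1.5
t2 = math.hypot(source[0] - sensor2[0], source[1] - sensor2[1]) / 1.5
candidates = locate_pd(sensor1, sensor2, t1, t2)
```

Two sensors generally give two candidate points; the one lying inside the tank, or inside the electrode region, would be retained.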

3.5 Experimental Results

The proposed algorithm was applied to the acquisitions taken on the MEDEPA test-bench (Fig. 3.1a). In this experiment, 76 series of acoustic and electric signals were acquired simultaneously. Each time series lasts approximately 8 ms and is sampled at 100 MS/s. First, the electric and acoustic patterns were selected from an isolated PD event (Fig. 3.1b) and filtered with the wavelet processing (Fig. 3.3). The electric pattern is a fast transient of about 7 MHz with a duration of approximately 1 µs. The length of the acoustic pattern is 35 µs and its central frequency is 150 kHz. A detail of the signals involved in the wavelet filtering to obtain the acoustic pattern of one sensor is shown in Fig. 3.4. A limitation of the selected pattern is the reverberation of the acoustic waves that is detected through the wall. For normal incidence of the acoustic signal on the PMMA wall (sound velocity of 2.8 mm/µs), the reflection takes about 7 µs to reach the detector again (a path of 20 mm, i.e. 20 mm / 2.8 mm/µs ≈ 7.1 µs). Thus the distortion of the acoustic signal is observed from 7 µs onwards.

Fig. 3.4 Acoustic pattern of sensor 1 after reconstruction and its decomposition

Once the patterns are selected and filtered, every acquired signal series is filtered, processed with the cross-correlation and analyzed with the PMCC. As a result, the local peaks of the cross-correlation give the time-stamps that can be associated with PD events. These events are also characterized by their indexes and by the transient waveforms that were found in the time series (Fig. 3.5).
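The matching step can be sketched as follows: a pattern is slid along a time series with the cross-correlation, and the best-aligned window is then scored with the Pearson product-moment correlation coefficient (PMCC). This is a minimal sketch on synthetic data, assuming NumPy is available. The helper name `locate_pattern`, the toy burst and the noise level are illustrative and are not taken from the measurements.

```python
import numpy as np

def locate_pattern(series, pattern):
    """Return (lag, pmcc) of the best match of `pattern` inside `series`."""
    # Remove the pattern mean so that a DC offset does not dominate the correlation
    p = pattern - pattern.mean()
    # mode="valid" keeps only the lags where the pattern fully overlaps the series
    xcorr = np.correlate(series, p, mode="valid")
    lag = int(np.argmax(xcorr))
    window = series[lag:lag + len(pattern)]
    pmcc = float(np.corrcoef(window, pattern)[0, 1])
    return lag, pmcc

rng = np.random.default_rng(0)
fs = 100e6                                   # sampling rate, 100 MS/s
t = np.arange(3500) / fs                     # 35 us pattern length
pattern = np.sin(2 * np.pi * 150e3 * t) * np.hanning(t.size)   # toy 150 kHz burst

series = 0.05 * rng.standard_normal(20000)   # synthetic noise floor
true_lag = 10200                             # synthetic delay: 102 us at 100 MS/s
series[true_lag:true_lag + pattern.size] += pattern

lag, pmcc = locate_pattern(series, pattern)
delay_us = lag / fs * 1e6
```

In practice several local peaks of `xcorr` would be kept rather than only the global maximum, and each one would be accepted or rejected according to its PMCC.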

Fig. 3.5 Electric and acoustic time-series. Sets of transient signals found as probable PD events


Fig. 3.6 Example of detected PD event (details of AE signal in sensor 1 and electric signal)

Acoustic and electric signals are matched together and the parameters of each signal are calculated. Figure 3.6 shows an example of one of the paired signals (acoustic sensor 1 and electric sensor 3) found by the algorithm. The parameters of each transient signal and of the pair are also shown. Signals can now be classified by their delays. In the experiment, some valid delay values have an incidence of four or more. The delay of maximum incidence is 102 µs (Fig. 3.7) for sensor 1 and 62 µs for sensor 2. Future work will examine the relation between the energy and the PMCC as a function of the location. The goal is to find and discriminate PD not only by location but also by the expected values of energy and PMCC. Finally, Eq. 3.4 is employed to locate the origin of the acoustic signals in the plane (Fig. 3.8). It is important to emphasize that isolated PD events are observed and that they can be valid events. However, they are not relevant for the detection of faults inside the insulation, because their low persistence and their location are not characteristic for the insulation condition assessment. Figure 3.8 shows the existence of a region inside the electrodes where PD were frequently generated. This region represents a damaged area in the insulation. The concentration of PD in this region is a symptom of an imminent failure of the electrical system.

Fig. 3.7 Histogram of delays obtained from the acoustic sensor 1 with the selected PD

Fig. 3.8 Location of the detected PD events in the plane
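The classification by delay can be sketched with a simple histogram. The delay values below are synthetic and the 2 µs bin width is an assumption; only the threshold of four occurrences follows the text.

```python
from collections import Counter

def frequent_delays(delays_us, bin_width_us=2.0, min_count=4):
    """Group delays into bins and keep the bins seen at least `min_count` times."""
    bins = Counter(round(d / bin_width_us) for d in delays_us)
    kept = {b * bin_width_us: n for b, n in bins.items() if n >= min_count}
    # Delay of maximum incidence among the retained bins (None if nothing is kept)
    mode = max(kept, key=kept.get) if kept else None
    return kept, mode

# Synthetic acoustic delays in microseconds (hypothetical values)
delays = [101.6, 102.1, 102.4, 101.9, 102.0, 95.2, 96.1, 96.3, 95.8, 130.2, 61.3]
kept, mode = frequent_delays(delays)
```

Delays that fall in sparsely populated bins correspond to the isolated events discussed above and are simply left out of `kept`.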

3.6 Conclusions and Future Work

The design and implementation of a post-processing algorithm for the detection and location of partial discharges and the condition assessment of the insulation has been presented. The algorithm is able to parameterize the signals, to define the ranges and to delimit the time windows in order to locate and classify PD in transformers. It was applied to signals from internal PD that were acquired by external acoustic sensors, but it is being extended to superficial PD and internal acoustic fiber-optic sensors. The purpose of this signal processing within the framework of the MEDEPA test-bed is to locate, identify and parameterize PD activity in order to predict imminent failures in insulation systems. The main features of the proposed algorithm are the following: its ability to detect and identify PD signals from different sensors, the adaptability of the wavelet filtering based on an external pattern, and the multi-sensor statistical analysis instead of a single-event approach. In addition, the wavelet denoising does not alter the temporal characteristics. Although the algorithm is not yet designed for real-time use, once the time series are processed their parameters are stored in a database, which is used as a reference of PD activity for further studies and can be extended to the maintenance of transformers in service. In order to improve the signal detection and identification, the next step is the calibration of the tool with different kinds of PD activity (types, intensities and sources) and the statistical analysis of the parameters in the database. In addition to the time windowing and the pattern matching, other parameters will be considered to assess the probability of PD: persistence, 3-D location, energy and PMCC. The location of PD sources is of main concern in the application of AE. The objective is to implement a 3-D algorithm compatible with the designed tools, either with external sensors only or also using internal sensors [9]. An all-acoustic system of four or more channels is being developed to locate PD events on the basis of multi-channel processing. Finally, the electro-acoustic conversion ratio of PD activity is an open research line with the implemented statistical analysis.

Acknowledgments This work was supported by the Spanish Ministry of Science and Innovation under the R&D projects No. DPI2006-15625-C03-01 and DPI2009-14628-C03-01 and the research grant No. BES-2007-17322. PD tests have been made in collaboration with the High Voltage Research and Tests Laboratory of Universidad Carlos III de Madrid (LINEALT).

References

1. Bartnikas R (2002) Partial discharges: their mechanism, detection and measurement. IEEE Trans Dielectr Electr Insul 9(5):763–808
2. Van Brunt RJ (1991) Stochastic properties of partial discharges phenomena. IEEE Trans Electr Insul 26(5):902–948
3. Lundgaard LE (1992) Partial discharge - part XIV: acoustic partial discharge detection - practical application. IEEE Electr Insul Mag 8(5):34–43
4. Suresh SDR, Usa S (2010) Cluster classification of partial discharges in oil-impregnated paper insulation. Adv Electr Comput Eng J 10(5):90–93
5. Macia-Sanahuja C, Lamela H, Rubio J, Gallego D, Posada JE, Garcia-Souto JA (2008) Acoustic detection of partial discharges with an optical fiber interferometric sensor. In: IMEKO TC 2 Symposium on Photonics in Measurements
6. IEEE Guide for the Detection and Location of Acoustic Emissions from Partial Discharges in Oil-Immersed Power Transformers and Reactors (2007). IEEE Power Engineering Society
7. Santosh Kumar A, Gupta RP, Udayakumar K, Venkatasami A (2008) Online partial discharge detection and location techniques for condition monitoring of power transformers: a review. In: International Conference on Condition Monitoring and Diagnosis, Beijing, China, 21–24 April 2008
8. von Glahn P, Stricklett KL, Van Brunt RJ, Cheim LAV (1996) Correlations between electrical and acoustic detection of partial discharge in liquids and implications for continuous data recording. Electr Insul 1:69–74
9. Garcia-Souto JA, Posada JE, Rubio-Serrano J (2010) All-fiber intrinsic sensor of partial discharge acoustic emission with electronic resonance at 150 kHz. SPIE Proc 7726:7
10. Rubio-Serrano J, Posada JE, Garcia-Souto JA (2010) Digital signal processing for the detection and location of acoustic and electric signals from partial discharges. In: Proceedings of the World Congress on Engineering 2010, WCE 2010, 30 June–2 July 2010, London, UK, vol 2184, issue 1, pp 967–972
11. Ma X, Zhou C, Kemp IJ (2002) Automated wavelet selection and thresholding for PD detection. IEEE Electr Insul Mag 18(2):37–45
12. Keppel G, Zedeck S (1989) Data analysis for research designs: analysis of variance and multiple regression/correlation approaches. Freeman, New York
13. Wang K-C (2009) Wavelet-based speech enhancement using time-frequency adaptation. EURASIP J Adv Signal Process 2009:8
14. Pinle Q, Yan L, Ming C (2008) Empirical mode decomposition method based on wavelet with translation invariance. EURASIP J Adv Signal Process 2008:6

Chapter 4

Study on a Wind Turbine in Hybrid Connection with an Energy Storage System

Hao Sun, Jihong Wang, Shen Guo and Xing Luo

Abstract Wind energy has attracted attention as an inexhaustible and abundant energy source for electrical power generation, and its penetration level has increased dramatically worldwide in recent years. However, its intermittent nature is still a universal challenge. As a possible solution, combining energy storage technology with renewable power generation has been considered as one of the options in recent years. The paper studies and compares two feasible energy storage means for wind power generation applications: compressed air energy storage (CAES) and electrochemical energy storage (ECES). A novel CAES structure in hybrid connection with a small power scale wind turbine is proposed. The mathematical model for the hybrid wind turbine system is developed and a simulation study of the system dynamics is given. In addition, a pneumatic power compensation control strategy is reported that achieves acceptable power output quality and a smooth mechanical connection transition.

H. Sun, J. Wang (corresponding author), S. Guo, X. Luo
School of Electronic, Electrical and Computer Engineering, University of Birmingham, Edgbaston, Birmingham, B15 2TT, UK

S. I. Ao and L. Gelman (eds.), Electrical Engineering and Applied Computing, Lecture Notes in Electrical Engineering 90, DOI: 10.1007/978-94-007-1192-1_4, © Springer Science+Business Media B.V. 2011


4.1 Introduction

Nowadays, the world faces the challenge of meeting a continuously increasing energy demand while reducing the harmful impact on the environment. In particular, wind energy appears to be a preferable solution for taking a considerable portion of the generation market, especially in the UK. However, the key challenge faced by wind power generation is intermittency. The variability of wind power, which arises from changes in wind speed, can lead to changes in power output from hour to hour. Figure 4.1 shows that the power output from a diversified wind power system usually changes hourly by 5 to 20%, either higher or lower [1]. Besides, energy regulatory policies all around the world have been characterized by the introduction of competition in the power industry and market, at both the wholesale and the retail levels. The variable market has brought uncertain variations onto power transmission and distribution networks, which have been studied at length [2, 3]. It is highly desirable to alleviate such impacts through alternative technologies. One proposed solution is to introduce an element of storage or an alternative supply for use when the ambient flux is insufficient to guarantee supply to the demand. The main reason is that energy storage can make wind power available when it is most in demand. Apart from pumped water, batteries, hydrogen and super-capacitors, compressed air energy storage (CAES) is also a well-known, controllable and affordable energy storage technology [4–6]. In a CAES system, the excess power is used to compress air, which can be stored in a vessel or a cavern. The energy stored in the compressed air can be used to generate electricity when required. Compared with other types of energy storage schemes, CAES is sustainable and does not produce any chemical waste. In this paper, a comparative analysis between CAES and electrochemical energy storage (ECES) is conducted. A hybrid energy storage wind turbine system is proposed, which connects a typical wind turbine and a vane-type air motor for compressed air energy conversion. The mathematical model for the whole system is derived and a simulation study is conducted. The study shows promising merits of the proposed hybrid connection of wind turbines and CAES.

Fig. 4.1 The hourly change of wind power output


4.2 Electrochemical and Compressed Air Energy Storage

In this paper, the feasibility of energy storage for a 2 kW household small-scale wind turbine is analyzed. Electrochemical energy storage is the most popular type of energy storage in the world, from small to large scales. For instance, the lead-acid battery is the oldest rechargeable battery with the widest range of applications, and it is a mature and cost-effective choice among the electrochemical batteries. The main advantages of ECES are no emissions, simple operation and higher energy efficiency. The efficiency of lead-acid batteries is generally around 80%. Compressed air energy storage is also clean, as no chemical disposal pollution is released to the environment [7]. However, CAES has a rather lower energy efficiency; much energy is lost during the process of thermal energy conversion [8–10]. A drawback of ECES is its relatively short lifetime, which shows mainly in the limited charge/discharge cycle life. For example, the cycle life of lead-acid batteries is roughly in the range of 500–1500 cycles. This issue can be more serious in wind power generation because of the high variation in wind speed and the low predictability of the wind power variation patterns; that is, the battery will be charged and discharged frequently. For CAES, the pneumatic components, including the compressor, air motor, tank, pipes and valves, are relatively robust; the major components have a lifetime of up to 50 years. Therefore, the lifetime of the whole system would be determined only by the majority of the mechanical components in the system. The capacity of an electrochemical battery is directly related to the active material in the battery. That means the more energy the battery can offer, the more active material it contains, and therefore its size, weight and price are almost linear in the battery capacity. For the compressed air system, the capacity correlates with the volume of the air storage tank. Although the pneumatic system also requires a large space to sustain long-term operation, it has proven more cost-effective in view of the practically free raw material (see Table 4.1 [11]). The electromotive force of a lead-acid cell is only about two volts owing to its electrochemical characteristics, so a large number of cells must be connected in series to obtain a higher terminal voltage. With this series connection, if one cell within the battery system fails, the whole battery may fail to store or deliver energy in the manner desired. Unfortunately, it is currently very hard to diagnose which cell in the system has failed, and replacing the whole pack of batteries is expensive. Besides, most lead-acid batteries designed for deep discharge are not sealed, so regular maintenance is required because of the gas emission caused by water electrolysis when the battery is overcharged. In comparison, CAES only needs regular leakage tests and oil maintenance. In brief, the comparison between CAES and ECES is summarized in Table 4.2.

Table 4.1 Typical marginal energy storage costs

Types                      Overall cost
Electro-chemical storage   >$400/kWh
Pumped storage             $80/kWh
Compressed air             $1/kWh

Table 4.2 Comparison between CAES and ECES

               CAES                          ECES
Service life   Long                          Short
Efficiency     Not high                      Very high
Size           Large, depends on tank size   Large, depends on cell number
Overall cost   Very cheap                    Very expensive
Maintenance    Needs regular maintenance     Hard to overhaul, needs regular maintenance
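As a rough illustration of the series-connection argument and of Table 4.1, the sketch below counts the cells needed for a given terminal voltage and compares marginal storage costs. The 48 V terminal voltage and the 5 kWh capacity are hypothetical values chosen for the example. The 2 V cell voltage and the cost figures are those quoted in the text, and the electro-chemical figure is treated as a lower bound.

```python
import math

CELL_VOLTAGE = 2.0   # V, approximate lead-acid cell EMF quoted in the text

# Marginal costs in $/kWh from Table 4.1 (electro-chemical value is a lower bound)
COST_PER_KWH = {"electro-chemical": 400.0, "pumped": 80.0, "compressed air": 1.0}

def cells_in_series(terminal_voltage):
    """Number of lead-acid cells needed in series to reach the terminal voltage."""
    return math.ceil(terminal_voltage / CELL_VOLTAGE)

def marginal_cost(capacity_kwh):
    """Marginal storage cost in dollars for each technology of Table 4.1."""
    return {name: cost * capacity_kwh for name, cost in COST_PER_KWH.items()}

n_cells = cells_in_series(48.0)   # hypothetical 48 V battery bank
costs = marginal_cost(5.0)        # hypothetical 5 kWh of storage
```

A single failed cell in such a string interrupts the whole series connection, which is the maintenance concern raised above.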

4.3 The Hybrid Wind Turbine System with CAES

There are two possible system structures for a hybrid wind turbine system with compressed air energy storage. One has been demonstrated as an economical solution for utility-scale energy storage on the timescale of hours. The energy storage system diagram is illustrated in Fig. 4.2. Such systems have been successfully implemented in Huntorf in Germany, McIntosh in Alabama, Norton in Ohio, a municipality in Iowa and in Japan, and one is under construction in Israel [12]. The CAES stores energy in the form of compressed air in an underground cavern. Air is compressed during off-peak periods and is used during peak periods to compensate for the variation of the demand by generating power with a turbo-generator/gas turbine system. However, this system has the disadvantage that it needs a large space to store the compressed air, such as a large underground cavern for large-scale power facilities, which may limit its applications in terms of site installation. In addition, large-capacity converter and inverter systems are neither cost effective nor power effective. For wind turbines of smaller capacity, this paper presents a novel hybrid technology to couple energy storage with wind power generation. As shown in Fig. 4.3, the electrical and pneumatic parts are connected through a mechanical transmission mechanism. This electromechanical integration offers simplicity of design and therefore helps to achieve higher efficiency and better value for money. Also, the direct compensation of the torque variation of the wind turbine will alleviate the stress imposed on the mechanical parts of the wind turbine.

4.4 Modelling Study of the Hybrid Wind Turbine System

For the proposed system illustrated in Fig. 4.3, a detailed mathematical model has been derived and is used for an initial test of the practicability of the whole hybrid system concept. At this stage, the system is designed to include a typical wind turbine with a permanent magnet synchronous generator (PMSG), a vane-type air motor and the associated mechanical power transmission system. The pneumatic system can be triggered to drive the turbine for power compensation during periods of low wind power. The mathematical model of the whole system is developed and described below.

Fig. 4.2 Utility-scale CAES application’s diagram

Fig. 4.3 Small scale hybrid wind turbine with CAES

4.4.1 Mathematical Model of the Wind Turbine

For a horizontal axis wind turbine, the mechanical power output $P$ that can be produced by the turbine at steady state is given by:

$$P = \frac{1}{2} \rho \pi r_T^2 v_w^3 C_p \tag{4.1}$$

where $\rho$ is the air density, $v_w$ is the wind speed and $r_T$ is the blade radius; $C_p$ expresses the capability of the turbine to convert energy from the wind. This coefficient depends on the tip speed ratio $\lambda = \omega_T r_T / v_w$ and the blade angle $\theta$, where $\omega_T$ denotes the turbine speed. As this requires knowledge of aerodynamics and the computations are rather complicated, numerical approximations have been developed [13, 14]. Here the following function will be used,

$$C_p(\lambda, \theta) = 0.22 \left( \frac{116}{\lambda_i} - 0.4\theta - 5 \right) e^{-\frac{12.5}{\lambda_i}} \tag{4.2}$$

with

$$\frac{1}{\lambda_i} = \frac{1}{\lambda + 0.08\theta} - \frac{0.035}{\theta^3 + 1} \tag{4.3}$$

To describe the impact of the dynamic behavior of the wind turbine, a simplified drive train model is considered:

$$\frac{d\omega_T}{dt} = \frac{1}{J_T} \left( T_T - T_L - B\omega_T \right) \tag{4.4}$$

where $J_T$ is the inertia of the turbine blades, $T_T$ and $T_L$ are the torques of the turbine and of the low speed shaft, respectively, and $B$ is the damping coefficient of the drive train system.
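Equations 4.1 to 4.3 can be evaluated directly, as in the following sketch. The air density, blade radius and operating point used in the example are assumed values for illustration and are not the parameters of the simulated turbine.

```python
import math

def power_coefficient(lam, theta=0.0):
    """Power coefficient C_p(lambda, theta) from Eqs. 4.2 and 4.3."""
    inv_lam_i = 1.0 / (lam + 0.08 * theta) - 0.035 / (theta**3 + 1.0)
    return 0.22 * (116.0 * inv_lam_i - 0.4 * theta - 5.0) * math.exp(-12.5 * inv_lam_i)

def turbine_power(v_w, omega_t, r_t, rho=1.225, theta=0.0):
    """Steady-state mechanical power from Eq. 4.1, in watts."""
    lam = omega_t * r_t / v_w                 # tip speed ratio
    return 0.5 * rho * math.pi * r_t**2 * v_w**3 * power_coefficient(lam, theta)

# Hypothetical operating point: 9 m/s wind, 40 rad/s rotor speed, 1.8 m blades
cp = power_coefficient(8.0)
p_mech = turbine_power(v_w=9.0, omega_t=40.0, r_t=1.8)
```

A useful sanity check on any such approximation is that $C_p$ stays below the Betz limit of 16/27.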

4.4.2 Modeling the Permanent Magnet Synchronous Generator (PMSG)

The model of a PMSG with a purely resistive load (for simplicity of analysis) consists of the following equations. For the mechanical part:

$$\frac{d\omega_G}{dt} = \frac{1}{J_G} \left( T_G - T_e - F\omega_G \right) \tag{4.5}$$

$$\frac{d\theta_G}{dt} = \omega_G \tag{4.6}$$

For the electrical part:

$$\frac{d i_d}{dt} = \frac{1}{L_d} v_d - \frac{R_s}{L_d} i_d + \frac{L_q}{L_d} p \omega_G i_q \tag{4.7}$$

$$\frac{d i_q}{dt} = \frac{1}{L_q} v_q - \frac{R_s}{L_q} i_q - \frac{L_d}{L_q} p \omega_G i_d - \frac{\varepsilon p \omega_G}{L_q} \tag{4.8}$$

$$T_e = 1.5 p \left[ \varepsilon i_q + (L_d - L_q) i_d i_q \right] \tag{4.9}$$

$$v_q = \frac{1}{3} \left[ \sin(p\theta_G) (2V_{ab} + V_{bc}) + \sqrt{3} V_{bc} \cos(p\theta_G) \right] \tag{4.10}$$

$$v_d = \frac{1}{3} \left[ \cos(p\theta_G) (2V_{ab} + V_{bc}) - \sqrt{3} V_{bc} \sin(p\theta_G) \right] \tag{4.11}$$

where $\theta_G$ and $\omega_G$ are the generator rotating angle and speed, $F$ is the combined viscous friction of rotor and load, $i$ is current, $v$ is voltage, $L$ is inductance, $R_s$ is the resistance of the stator windings, $p$ is the number of pole pairs of the generator, and $\varepsilon$ is the amplitude of the flux induced by the permanent magnets of the rotor in the stator phases. The subscripts $a, b, c, d, q$ denote the $a$, $b$, $c$, $d$ and $q$ axes, respectively. The three-phase coordinates and the d–q rotating frame coordinates can be transformed into each other through Park's transformation [15].

Fig. 4.4 Structure of a vane type air motor with four vanes
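The electrical part of the PMSG model can be written as two small functions, one for the line-to-line to d-q voltage conversion (Eqs. 4.10 and 4.11) and one for the current derivatives and torque (Eqs. 4.7 to 4.9). This is a sketch under assumptions: the function names and all numerical parameter values are hypothetical, and the voltage conversion uses the sum $2V_{ab} + V_{bc}$ in both components, which is what a balanced three-phase set requires.

```python
import math

def line_to_dq(v_ab, v_bc, theta_g, p):
    """d-q voltages from two line-to-line voltages (Eqs. 4.10 and 4.11)."""
    ang = p * theta_g
    v_q = (math.sin(ang) * (2 * v_ab + v_bc) + math.sqrt(3) * v_bc * math.cos(ang)) / 3
    v_d = (math.cos(ang) * (2 * v_ab + v_bc) - math.sqrt(3) * v_bc * math.sin(ang)) / 3
    return v_d, v_q

def pmsg_electrical(i_d, i_q, omega_g, v_d, v_q, Rs, Ld, Lq, p, eps):
    """Current derivatives (Eqs. 4.7, 4.8) and electromagnetic torque (Eq. 4.9)."""
    did = (v_d - Rs * i_d + Lq * p * omega_g * i_q) / Ld
    diq = (v_q - Rs * i_q - Ld * p * omega_g * i_d - eps * p * omega_g) / Lq
    Te = 1.5 * p * (eps * i_q + (Ld - Lq) * i_d * i_q)
    return did, diq, Te

# Balanced three-phase set of amplitude V: the d-q magnitude should equal V
V, phase, theta_g, p = 100.0, 0.3, 0.7, 4
va, vb, vc = (V * math.cos(phase - k * 2 * math.pi / 3) for k in range(3))
v_d, v_q = line_to_dq(va - vb, vb - vc, theta_g, p)

# Open-circuit check: with zero currents and voltages only the back-EMF term remains
did, diq, Te = pmsg_electrical(0.0, 0.0, 10.0, 0.0, 0.0,
                               Rs=1.0, Ld=0.01, Lq=0.01, p=4, eps=0.2)
```

The amplitude-preserving property checked here is a quick way to catch sign errors in the transformation.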

4.4.3 Model of the Vane-Type Air Motor

Figure 4.4 shows the sketch of a vane-type air motor with four vanes. In this paper, input port 1 is taken to be the inlet port, so input port 2 is the outlet port. Compressed air is admitted through input port 1 from the servo valves and fills the cavity between the vanes, housing and rotor. Chamber A, which is open to input port 1, fills up under high pressure. Once the port is closed by the moving vane, the air expands to a lower pressure in a larger volume between the vane and the preceding vane, at which point the air is released via input port 2. The difference in air pressure acting on the vane results in a torque acting on the rotor shaft [16, 17]. A simplified vane motor structure is shown in Fig. 4.5. The vane working radius measured from the rotor centre, $x_a$, can be derived as:

$$x_a = e \cos\varphi + \sqrt{R_m^2 - e^2 \sin^2\varphi} \tag{4.12}$$

The volumes of chamber A and chamber B are derived as follows and are denoted by the subscripts $a$ and $b$ in the equations of this section.

$$V_a = \frac{1}{2} L_m \left( R_m^2 - r^2 \right) (\pi + \varphi) + \frac{1}{4} L_m e^2 \sin 2\varphi + L_m e R_m \sin\varphi \tag{4.13}$$

$$V_b = \frac{1}{2} L_m \left( R_m^2 - r^2 \right) (\pi - \varphi) - \frac{1}{4} L_m e^2 \sin 2\varphi - L_m e R_m \sin\varphi \tag{4.14}$$

Fig. 4.5 Schematic diagram of the structure of a vane-type air motor (labels: chamber A, chamber B, $r$, $R_m$, $X_a$, $2e$, $\Phi$)

where $R_m$ is the radius of the motor body, $e$ is the eccentricity, $L_m$ is the vane active length in the axial direction, $\varphi$ is the motor rotating angle and $r$ is the rotor radius. The pressures of chambers A and B can be derived [10]:

$$\dot{P}_a = -\frac{k \dot{V}_a}{V_a} P_a + \frac{k}{V_a} R T_s C_d C_0 A_a X_a f(P_a, P_s, P_e) \tag{4.15}$$

$$\dot{P}_b = -\frac{k \dot{V}_b}{V_b} P_b + \frac{k}{V_b} R T_s C_d C_0 A_b X_b f(P_b, P_s, P_e) \tag{4.16}$$

where $T_s$ is the supply temperature, $R$, $C_d$ and $C_0$ are air constants, $A$ is the effective port width of the control valve, $X$ is the valve spool displacement and $f$ is a function of the ratio between the downstream and upstream pressures at the orifice. The drive torque is determined by the difference of the torque acting on the vane between the drive and exhaust chambers, and is given by [13]:

$$M = (P_a - P_b) \left( e^2 \cos 2\varphi + 2 e R_m \cos\varphi + R_m^2 - r^2 \right) \frac{L_m}{2} \tag{4.17}$$
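The chamber geometry lends itself to a quick numerical check. Differentiating Eq. 4.13 with respect to the rotating angle gives the bracketed factor of Eq. 4.17 times $L_m/2$, so the drive torque should equal $(P_a - P_b)\, dV_a/d\varphi$. The sketch below verifies this with a central difference. The motor dimensions and pressures are hypothetical values chosen only for the example.

```python
import math

def chamber_volumes(phi, Rm, r, e, Lm):
    """Chamber volumes V_a and V_b from Eqs. 4.13 and 4.14."""
    base = 0.5 * Lm * (Rm**2 - r**2)
    wobble = 0.25 * Lm * e**2 * math.sin(2 * phi) + Lm * e * Rm * math.sin(phi)
    return base * (math.pi + phi) + wobble, base * (math.pi - phi) - wobble

def drive_torque(phi, Pa, Pb, Rm, r, e, Lm):
    """Drive torque M from Eq. 4.17."""
    return (Pa - Pb) * (e**2 * math.cos(2 * phi) + 2 * e * Rm * math.cos(phi)
                        + Rm**2 - r**2) * Lm / 2

# Hypothetical geometry in metres: body radius, rotor radius, eccentricity, vane length
geom = dict(Rm=0.040, r=0.032, e=0.006, Lm=0.050)
phi, dphi = 0.8, 1e-6

# Central-difference estimate of dV_a/dphi
va_plus, _ = chamber_volumes(phi + dphi, **geom)
va_minus, _ = chamber_volumes(phi - dphi, **geom)
dva_dphi = (va_plus - va_minus) / (2 * dphi)

Pa, Pb = 6e5, 1e5                      # Pa, hypothetical chamber pressures
M = drive_torque(phi, Pa, Pb, **geom)  # N m
va, vb = chamber_volumes(phi, **geom)
```

The sum `va + vb` is independent of the angle, which reflects that the two chambers share a fixed total volume in this simplified model.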

4.4.4 Model of Mechanical Power Transmission

The power transmission system, which is similar to a vehicle air conditioning system, includes the clutch and the belt speed transmission to ensure coaxial running, as shown in Fig. 4.6 [18]. The clutch is engaged only when the turbine and the air motor operate at the same speed, to avoid mechanical damage to the system components. Even so, the system design faces another challenge during the engagement: in most instances, the air motor cannot reach a speed as high as that of the turbine generator. Therefore, the two plates of the belt transmission are designed with different diameters so that they act as a gearbox. The main issue in modeling the power transmission is that two different configurations arise:

Fig. 4.6 The structure of the power transmission system in hybrid wind turbine

Case I Clutch disengaged: After the air motor has started, and until the two sides of the electromagnetic clutch reach the same speed, the clutch can be considered completely separated. Meanwhile, the air motor idles with the inertia load of the clutch friction plate. Considering friction and different payloads and applying Newton's second law of angular motion, we have

$$M - M_f \dot{\varphi} = (J_a + J_f) \ddot{\varphi} \tag{4.18}$$

where $J_a$ is the air motor inertia, $J_f$ is the friction plate inertia, $M$ is the drive torque, $M_f$ is the friction coefficient, $\dot{\varphi}$ is the angular velocity and $\ddot{\varphi}$ is the angular acceleration. Both the active plate and the passive plate of the belt transmission can be considered as the generator inertia load, so the total equivalent inertia is

$$J_{total} = J_{pass} + \kappa^2 J_{act} \tag{4.19}$$

where $J_{pass}$ and $J_{act}$ are the inertias of the passive and active plates, respectively, and $\kappa$ is the speed ratio of the belt.

Case II Clutch engaged: Once the angular velocity of the air motor $\dot{\varphi}$ meets the speed of the active plate $\omega_G / \kappa$, the clutch is engaged. After the engagement, the active plate and the friction plate can be treated as one mass. The dynamic equations are as follows:

$$\begin{cases}
M - M_f \dot{\varphi} - T_{act} = (J_a + J_f + J_{act}) \ddot{\varphi} \\
T_{pass} = \dfrac{T_{act}\, \eta}{\kappa} \\
\dfrac{d\omega_G}{dt} = \dfrac{1}{J_G + J_{pass}} \left( T_H + T_{pass} - T_e - F\omega_G \right) \\
\dot{\varphi} = \dfrac{\omega_G}{\kappa}
\end{cases}$$

where $T_H$ is the input torque of the wind turbine high speed shaft and $\eta$ is the transfer efficiency of the belt. Choose the system state variables to be $x_1$: pressure in chamber A, $x_2$: pressure in chamber B, $x_3$: rotated angle, $x_4$: angular speed, $x_5$: current in the d axis,


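$x_6$: current in the q axis; and the input variables to be $u_1$: wind speed and $u_2$: input valve displacement. Combining the wind turbine, drive train and generator models, the state functions of the whole hybrid wind turbine system can then be described by:

$$\begin{aligned}
\dot{x}_1 &= -\frac{k \dot{V}_a}{V_a} x_1 + \frac{k}{V_a} R T_s C_d C_0 A_a u_2 f(P_a, P_s, P_e) \\
\dot{x}_2 &= -\frac{k \dot{V}_b}{V_b} x_2 + \frac{k}{V_b} R T_s C_d C_0 A_b X_b f(P_b, P_s, P_e) \\
\dot{x}_3 &= \frac{x_4}{\kappa} \\
\dot{x}_4 &= \frac{1}{J_G + J_{pass} + J_T \dfrac{\eta'}{\kappa'^2} + (J_a + J_f + J_{act}) \dfrac{\eta}{\kappa^2}} \left\{ \eta' \frac{\rho \pi r_T^2 u_1^3 C_p}{2 x_4} - \eta' \frac{B x_4}{\kappa'^2} + \eta \frac{M}{\kappa} - \eta \frac{M_f x_4}{\kappa^2} - \frac{3}{2} p \left( \varepsilon x_6 + L_d x_6 x_5 - L_q x_6 x_5 \right) - F x_4 \right\} \\
\dot{x}_5 &= \frac{v_d}{L_d} - \frac{R_s}{L_d} x_5 + \frac{L_q}{L_d} p x_4 x_6 \\
\dot{x}_6 &= \frac{v_q}{L_q} - \frac{R_s}{L_q} x_6 - \frac{L_d}{L_q} p x_4 x_5 - \frac{\varepsilon p x_4}{L_q}
\end{aligned}$$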

where $\eta'$ and $\kappa'$ are the efficiency and speed ratio of the wind turbine gearbox. With such a complicated system model structure, it is sometimes difficult to obtain accurate values of the system parameters. Intelligent optimization and identification methods have proved effective in tackling this challenging problem [19, 20]. The test system for the proposed hybrid system structure is under development in the authors' laboratory, and the data obtained from the rig can be used to improve the model accuracy.
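The switching between the two clutch configurations can be sketched as a few small helpers. The tolerance used to decide that the speeds match, the function names and all numerical values are assumptions made for this illustration. The torque and acceleration relations follow the Case II equations, with `kappa` standing for the belt speed ratio and `eta` for the belt transfer efficiency.

```python
def clutch_engaged(phi_dot, omega_g, kappa, rel_tol=0.01):
    """Case II applies once the motor speed reaches the active plate speed omega_G / kappa."""
    target = omega_g / kappa
    return abs(phi_dot - target) <= rel_tol * abs(target)

def passive_plate_torque(t_act, eta, kappa):
    """Torque delivered to the passive plate: T_pass = T_act * eta / kappa."""
    return t_act * eta / kappa

def generator_acceleration(t_h, t_pass, t_e, omega_g, F, J_g, J_pass):
    """d(omega_G)/dt from the Case II equations."""
    return (t_h + t_pass - t_e - F * omega_g) / (J_g + J_pass)

kappa, eta = 2.0, 0.95            # hypothetical belt ratio and efficiency
omega_g, phi_dot = 200.0, 99.5    # rad/s, generator and air motor speeds

engaged = clutch_engaged(phi_dot, omega_g, kappa)
# While the clutch is disengaged (Case I) no motor torque reaches the generator
t_pass = passive_plate_torque(4.0, eta, kappa) if engaged else 0.0
acc = generator_acceleration(t_h=8.0, t_pass=t_pass, t_e=9.0, omega_g=omega_g,
                             F=0.001, J_g=0.02, J_pass=0.005)
```

In a full simulation this test would be evaluated at every time step, and the model would switch from Eq. 4.18 to the Case II equations when it first returns true.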

4.5 Simulation Study

The model derived above for the proposed hybrid wind turbine system is implemented in the MATLAB/SIMULINK environment to observe the dynamic behavior of the whole system, as shown in Fig. 4.7. The simulation results are described below. The simulation considers a scenario in which the input wind speed steps down within a 40 s observation window: it drops from 9 to 8 m/s at 20 s (see Fig. 4.8). For comparison, the results from the hybrid system using a 6 bar supply pressure and those from the stand-alone system without pneumatic actuators are shown in Fig. 4.9. It can be seen that the hybrid system can still obtain a high turbine speed owing to the contribution of the air motor output. It can also maintain a steady value even when the natural wind speed decreases. Regrettably, however, the power coefficient of the turbine falls because of the increased tip speed ratio $\lambda = \omega_T r_T / v_w$. This should be considered an adverse effect of the hybrid system. Figure 4.10 provides a clear contrast between the hybrid and stand-alone cases in terms of generator operation. It can be seen that the power compensation can almost make up the energy shortfall at the lower wind speed. Figure 4.11 shows the simulation results of the vane-type air motor. The air motor started at 20 s and joined the wind turbine system rapidly owing to its fast response. It is worth noting that this type of air motor generally runs with a well-marked periodic fluctuation, which originates from the cyclically changing difference between $P_a$ and $P_b$ (the pressures in chamber A and chamber B). However, in the hybrid system, the air motor operates rather smoothly, which may result from the large inertia of the whole system.

Fig. 4.7 The block diagram of the simulation system

Fig. 4.8 Input wind speed (wind speed in m/s versus time in s)

Fig. 4.9 Simulation results of wind turbine (turbine speed in rad/s and power coefficient versus time in s, hybrid and stand-alone)

Fig. 4.10 Simulation results of the responses of the PMSG (generator speed in rad/s and generator power in watt versus time in s, hybrid and stand-alone)

Fig. 4.11 Simulation results of vane type air motor (air motor speed in rad/s and chamber pressures Pa and Pb in pascal versus time in s)
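The fall in power coefficient can be made explicit with a short calculation. Assuming, as an idealisation, that the air motor holds the turbine speed $\omega_T$ exactly constant through the wind step from 9 to 8 m/s, the tip speed ratio scales inversely with the wind speed:

$$\frac{\lambda_2}{\lambda_1} = \frac{\omega_T r_T / v_{w,2}}{\omega_T r_T / v_{w,1}} = \frac{v_{w,1}}{v_{w,2}} = \frac{9}{8} = 1.125$$

so $\lambda$ rises by 12.5%. If the turbine was operating at or above its optimal tip speed ratio before the step, this increase moves it further down the falling side of the $C_p$ curve of Eq. 4.2.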

4.6 Concluding Remarks

This paper presents a concise review of two types of energy storage technologies. A new concept of CAES applied to a small power scale wind turbine system is introduced. The complete process mathematical model is derived and implemented in the MATLAB/SIMULINK environment. The simulation results are very encouraging, as the extra power from the air motor output compensates for the power shortfall from the wind energy. This strategy enables the wind turbine to operate with a relatively uniform speed profile, which in turn will improve the operating condition of the overall system. The simple structure of the system and the advantages of CAES would provide opportunities for such a system to find a place in the future renewable energy electricity market. The research on hybrid wind turbines is still ongoing and further improvement is expected. An advanced tracking control strategy is a promising methodology and is currently under consideration by the research team [7, 21].

Acknowledgments The authors would like to thank ERDA/AWM for the support of the Birmingham Science City Energy Efficiency & Demand Reduction project, the China 863 Project (2009AA05Z212), and the University of Birmingham, UK, for the scholarships for Hao Sun and Xing Luo.

References 1. Sinden G (2005) Wind power and UK wind resource. Environmental change institute, University of Oxford 2. Akhmatov V (2002) Variable-speed wind turbines with doubly-fed induction generators. Part II: power system stability. Wind Eng 26(3):71–88 3. Hansena AD, Michalke G (2007) Fault ride-through capability of DFIG wind turbines. Renew Energy 32:1594–1610 4. Cavallo A (2007) Controllable and affordable utility-scale electricity from intermittent wind resources and compressed air energy storage (CAES). Energy 32:120–127 5. Lemofouet S, Rufer A (2006) A hybrid energy storage system based on compressed air and supercapacitors with maximum efficiency point tracking (MEPT). IEEE Trans Ind Electron 53(4):1105–1115 6. Van der Linden S (2006) Bulk energy storage potential in the USA, current developments and future prospects. Energy 31:3446–3457 7. Wang J, Pu J, Moore P (1999) Accurate position control of servo pneumatic actuator systems: an application to food packaging. Control Eng Pract 7(6):699–706 8. Yang L, Wang J, Lu N et al (2007) Energy efficiency analysis of a scroll-type air motor based on a simplified mathematical model. In: The Proceedings of the World Congress on Engineering. London, pp 759–764, 2–4 July 2007 9. Wang J, Yang L, Luo X, Mangan S, Derby JW (2010) Mathematical modelling study of scroll air motors and energy efficiency analysis—Part I. IEEE-ASME Transactions on Mechatronics. doi:10.1109/TMECH.2009.2036608 10. Wang J, Luo X, Yang L, Shpanin L, Jia N, Mangan S, Derby JW (2010) Mathematical modelling study of scroll air motors and energy efficiency analysis—Part II. IEEE-ASME Transactions on Mechatronics. doi:10.1109/TMECH.2009.2036607 11. Price A (2009) The current status of electrical energy storage systems. ESA London Meeting, London, UK 12. Vongmanee V (2009) The renewable energy applications for uninterruptible power supply based on compressed air energy storage system. 
In: IEEE Symposium on Industrial Electronics and Applications (ISIEA 2009). Kuala Lumpur, Malaysia, 4–6 October 2009 13. Heier S (1998) Grid integration of wind energy conversion systems. Wiley, Chicheste 14. Sun H, Wang J, Guo S, Luo X (2010) Study on energy storage hybrid wind power generation systems. In: The Proceedings of the World Congress on Engineering 2010 WCE 2010, Vol II, London, UK, June 30–July 2, pp 833–838


15. Pillay P, Krishnan R (1989) Modeling, simulation and analysis of permanent magnet motor drives, part 1: the permanent-magnet synchronous motor drive. IEEE Trans Ind Appl 25:265–273 16. Luo X, Wang J, Shpanin L, Jia N, Liu G, Zinober A (2008) Development of a mathematical model for vane-type air motors with arbitrary N vanes. In: International Conference of Applied and Engineering Mathematics, WCE, vol I–II. London, pp 362–367, July 2008 17. Wang J, Pu J, Moore PR, Zhang Z (1998) Modelling study and servo-control of air motor systems. Int J Control 71(3):459–476 18. Yeung YPB, Cheng KWE, Chan WW, Lam CY, Choi WF, Ng TW (2009) Automobile hybrid air conditioning technology. In: The Proceedings of the 3rd International Conference on Power Electronics Systems and Applications, p 116 19. Wei JL, Wang J, Wu QH (2007) Development of a multi-segment coal mill model using an evolutionary computation technique. IEEE Trans Energy Convers 22:718–727 20. Zhang YG, Wu QH, Wang J, Oluwanda G, Matts D, Zhou XX (2002) Coal mill modelling by machine learning based on on-site measurement. IEEE Trans Energy Convers 17(4):549–555 21. Wang J, Kotta U, Ke J (2007) Tracking control of nonlinear pneumatic actuator systems using static state feedback linearization of the input-output map. In: Proceedings of the Estonian Academy of Sciences-Physics Mathematics, vol 56, pp 47–66

Chapter 5

SAR Values in a Homogenous Human Head Model

Levent Seyfi and Ercan Yaldız

Abstract This chapter presents how to determine and reduce the specific absorption rate (SAR) in a mobile phone user. Both an experimental measurement technique and a numerical computing method are described. In addition, a numerical application on reducing the SAR value induced in the human head is carried out. A mobile phone operating at 900 MHz and shielded with copper is considered in order to reduce the SAR. The simulations, which calculate the maximum SAR values, are conducted in the Matlab programming language using the two-dimensional (2D) Finite Difference Time Domain (FDTD) method. Calculations are made separately for 1 g and 10 g averaged SAR. The head model is assumed to be homogeneous.

5.1 Introduction

Today the mobile phone is one of the most widely used pieces of electronic equipment, with a large number of users of all ages. For this reason, designing mobile phones that do not adversely affect human health is of great importance. Mobile phones are mostly used very close to the ear, as shown in Fig. 5.1. In this case, the electromagnetic (EM) wave of the mobile phone radiates mainly towards the user's head (that is, the brain).

L. Seyfi (&) E. Yaldız Department of Electrical and Electronics Engineering, Selçuk University, Konya, Turkey e-mail: [email protected] E. Yaldız e-mail: [email protected]

S. I. Ao and L. Gelman (eds.), Electrical Engineering and Applied Computing, Lecture Notes in Electrical Engineering, 90, DOI: 10.1007/978-94-007-1192-1_5, Ó Springer Science+Business Media B.V. 2011


Fig. 5.1 Distribution of EM waves from a mobile phone on human head

Mobile phones communicate by transmitting radio frequency (RF) waves through base stations. RF waves are non-ionizing radiation, which can neither break chemical bonds nor cause ionization in the human body. The operating frequencies of mobile phones lie between 450 and 2700 MHz, depending on the country and the service provider. The RF exposure of a user falls off rapidly with increasing distance from the mobile phone. Using the phone in areas of good reception also decreases exposure, as it allows the phone to communicate at low power. A large number of studies have been performed over the last two decades to assess whether mobile phones pose a potential health risk [1, 2]. To date, no adverse health effects have been established for mobile phone use [3]. Investigations of the effects on human health of mobile phones and other devices emitting EM waves, and of measures against them, are still continuing.

According to the evaluated results, a temperature increase of 1°C in tissue cannot be removed by the circulatory system, and this damages the tissue. Limits for each frequency band were specified by the relevant institutions according to this criterion. For the general public, the limits on the whole-body average SAR and on the localized SAR are 0.08 and 2 W/kg, respectively, in the 10 MHz–10 GHz frequency band [4]. SAR (W/kg) is the power absorbed per unit mass of tissue.

SAR values cannot be measured experimentally in living cells. A specifically created model (phantom) and specialized laboratory test equipment are used instead: SAR values are measured by placing a probe into the phantom. The equipment consists of a phantom (human or box), a precision robot, RF field sensors, and a mobile phone holder, as shown in Fig. 5.2. The phantom is filled with a liquid that approximately represents the electrical properties of human tissue. As an alternative to using the phantom, SAR values can also be determined by numerical calculation [5–8]. In this case, the calculations are executed with simulations using the electrical properties and physical dimensions of a typical human head.

Mobile phones are manufactured to comply with the SAR limits. However, negative consequences may still appear over time because the phone is held close to the head during calls and because of long calls. It may therefore be necessary to take some precautions. For instance, a headset with a microphone can be used while calling. Alternatively, the EM waves emitted from the mobile phone towards the user's head can be attenuated by using a conductive material. Conductive material


Fig. 5.2 Experimentally measuring SAR value with a phantom

mostly reflects the EM waves back. Hence, the absorption of EM waves can be greatly reduced by placing a suitably sized conductive plate between the mobile phone's antenna and the user's head. Other studies have introduced different techniques for reducing SAR [9, 10]. In this chapter, the 2D-FDTD technique, absorbing boundary conditions, and the SAR calculation method are described, and a numerical application is presented. In the application, 2D simulations were conducted to investigate the reduction of SAR values in the user's head by a copper plate. The simulations were carried out in the Matlab programming language using the 2D-FDTD method. The first-order Mur boundary condition was used to remove the artificial reflections that naturally occur at the edges of an FDTD grid.

5.2 2D-FDTD Method

Maxwell's differential equations show that the change of the E-field in time depends on the change of the H-field across space. This leads to the basic FDTD time-stepping relation: at any point in space, the updated value of the E-field depends on the stored value of the E-field and on the numerical curl of the local distribution of the H-field in space. The H-field is updated analogously, from the stored value of the H-field and the numerical curl of the E-field. Iterating the E-field and H-field updates results in a marching-in-time process in which sampled-data analogs of the continuous EM waves under consideration propagate in a numerical grid stored in computer memory. Yee proposed that the vector components of the E-field and H-field be spatially staggered about rectangular unit cells of a Cartesian computational grid, so that each E-field vector component is located midway between a pair of H-field vector components, and conversely [11, 12]. This scheme, now known as the Yee lattice, forms the core of many FDTD codes. The choices of grid cell size and time step size are very important in applying FDTD. The cell size must be small enough to permit accurate results at the highest operating frequency, yet large enough to keep computer requirements manageable.


Cell size is directly affected by the materials present. The greater the permittivity or conductivity, the shorter the wavelength at a given frequency and the smaller the cell size required. The cell size must be much less than the smallest wavelength for which accurate results are desired. An often used cell size is $\lambda/10$ or less at the highest frequency. For some situations, such as a very accurate determination of radar scattering cross-sections, $\lambda/20$ or smaller cells may be necessary. On the other hand, good results are sometimes obtained with as few as four cells per wavelength. If the cell size is made much larger than this, the Nyquist sampling limit, $\lambda = 2\Delta x$, is approached too closely for reasonable results to be obtained, and significant aliasing is possible for signal components above the Nyquist limit.

Once the cell size is selected, the maximum time step is determined by the Courant stability condition. Smaller time steps are permissible, but do not generally improve computational accuracy except in special cases. A larger time step results in instability. To understand the basis for the Courant condition, consider a plane wave propagating through an FDTD grid. In one time step, any point on this wave must not pass through more than one cell, because during one time step FDTD can propagate the wave only from one cell to its nearest neighbors. To determine this time step constraint, a plane wave direction is considered such that the plane wave propagates most rapidly between field point locations. This direction will be perpendicular to the lattice planes of the FDTD grid. For a grid of dimension d (where d = 1, 2, or 3), with all cell sides equal to $\Delta u$ and with v the maximum velocity of propagation in any medium in the problem, usually the speed of light in free space [13], it is found that

$$v\,\Delta t \le \frac{\Delta u}{\sqrt{d}} \tag{5.1a}$$

for stability. If the cell sizes are not equal, the condition for a 2-D and a 3-D rectangular grid is, respectively [14, 15],

$$v\,\Delta t \le \frac{1}{\sqrt{\dfrac{1}{(\Delta x)^2} + \dfrac{1}{(\Delta y)^2}}} \tag{5.1b}$$

$$v\,\Delta t \le \frac{1}{\sqrt{\dfrac{1}{(\Delta x)^2} + \dfrac{1}{(\Delta y)^2} + \dfrac{1}{(\Delta z)^2}}} \tag{5.1c}$$

where $\Delta t$ is the temporal increment and $\Delta x$, $\Delta y$, $\Delta z$, the sides of the cell, are the spatial increments in the x, y, and z-direction, respectively. Although the real world is obviously 3D, many useful problems can be solved in two dimensions when one of the dimensions is much longer than the other two. In this case, it is generally assumed that the field solution does not vary in this dimension, which simplifies the analysis greatly. In electromagnetics, this assumption permits us to decouple the Maxwell equations into two sets of fields or modes, which are often called transverse magnetic and


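transverse electric. Any field subject to the assumption of no variation in z can be written as the sum of these modes. Transverse magnetic modes (TMz) contain the field components Ez(x, y, t), Hx(x, y, t) and Hy(x, y, t). Transverse electric modes (TEz) contain the field components Hz(x, y, t), Ex(x, y, t) and Ey(x, y, t). The 2D TM mode is [16]

$$\frac{\partial H_x}{\partial t} = \frac{1}{\mu}\left(-\frac{\partial E_z}{\partial y} - \rho' H_x\right) \tag{5.2a}$$

$$\frac{\partial H_y}{\partial t} = \frac{1}{\mu}\left(\frac{\partial E_z}{\partial x} - \rho' H_y\right) \tag{5.2b}$$

$$\frac{\partial E_z}{\partial t} = \frac{1}{\varepsilon}\left(\frac{\partial H_y}{\partial x} - \frac{\partial H_x}{\partial y} - \sigma E_z\right) \tag{5.2c}$$

The 2D TE mode is

$$\frac{\partial E_x}{\partial t} = \frac{1}{\varepsilon}\left(\frac{\partial H_z}{\partial y} - \sigma E_x\right) \tag{5.3a}$$

$$\frac{\partial E_y}{\partial t} = \frac{1}{\varepsilon}\left(-\frac{\partial H_z}{\partial x} - \sigma E_y\right) \tag{5.3b}$$

$$\frac{\partial H_z}{\partial t} = \frac{1}{\mu}\left(\frac{\partial E_x}{\partial y} - \frac{\partial E_y}{\partial x} - \rho' H_z\right) \tag{5.3c}$$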

where $\mu$, $\rho'$, $\varepsilon$, and $\sigma$ are the permeability, equivalent magnetic resistivity, permittivity, and conductivity, respectively. The TM and TE modes are decoupled, that is, they contain no common field vector components. In fact, these modes are completely independent for structures composed of isotropic materials, so they can exist simultaneously with no mutual interaction. Problems having both TM and TE excitation can be solved by a superposition of these two separate problems [14]. When the 2D TM mode is discretized, the FDTD formulas are

$$H_{x,i,j+1/2}^{n+1/2} = D_a H_{x,i,j+1/2}^{n-1/2} + D_b\left(E_{z,i,j}^{n} - E_{z,i,j+1}^{n}\right) \tag{5.4a}$$

$$H_{y,i+1/2,j}^{n+1/2} = D_a H_{y,i+1/2,j}^{n-1/2} + D_b\left(E_{z,i+1,j}^{n} - E_{z,i,j}^{n}\right) \tag{5.4b}$$

$$E_{z,i,j}^{n+1} = C_a E_{z,i,j}^{n} + C_b\left(H_{y,i+1/2,j}^{n+1/2} - H_{y,i-1/2,j}^{n+1/2} + H_{x,i,j-1/2}^{n+1/2} - H_{x,i,j+1/2}^{n+1/2}\right) \tag{5.4c}$$

with the coefficients

$$C_a = \frac{2\varepsilon - \sigma\,\Delta t}{2\varepsilon + \sigma\,\Delta t} \tag{5.5a}$$

$$C_b = \frac{2\,\Delta t}{\Delta x\,(2\varepsilon + \sigma\,\Delta t)} \tag{5.5b}$$

$$D_a = \frac{2\mu - \rho'\,\Delta t}{2\mu + \rho'\,\Delta t} \tag{5.5c}$$

$$D_b = \frac{2\,\Delta t}{\Delta x\,(2\mu + \rho'\,\Delta t)} \tag{5.5d}$$

where n denotes discrete time. Array indices in a programming language must be integers, so there is no location such as n + 1/2. The half-integer indices can therefore be rounded up to the next integer [17], as follows.

$$H_{x,i,j+1}^{n+1} = D_a H_{x,i,j+1}^{n} + D_b\left(E_{z,i,j}^{n} - E_{z,i,j+1}^{n}\right) \tag{5.6a}$$

$$H_{y,i+1,j}^{n+1} = D_a H_{y,i+1,j}^{n} + D_b\left(E_{z,i+1,j}^{n} - E_{z,i,j}^{n}\right) \tag{5.6b}$$

$$E_{z,i,j}^{n+1} = C_a E_{z,i,j}^{n} + C_b\left(H_{y,i+1,j}^{n+1} - H_{y,i,j}^{n+1} + H_{x,i,j}^{n+1} - H_{x,i,j+1}^{n+1}\right) \tag{5.6c}$$
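The integer-indexed updates (5.6a)–(5.6c) map directly onto array operations. The following is a minimal Python sketch, not the authors' Matlab program. It assumes a square cell, a lossless free-space grid, a sinusoidal hard source at the grid centre, and Ez simply held at zero on the outer edge in place of an absorbing boundary. All function and variable names are illustrative.

```python
import math

EPS0 = 8.854187817e-12      # permittivity of free space (F/m)
MU0 = 4.0e-7 * math.pi      # permeability of free space (H/m)

def tm_coefficients(eps, mu, sigma, rho_m, dt, dx):
    """Update coefficients Ca, Cb, Da, Db of Eqs. (5.5a)-(5.5d)."""
    ca = (2 * eps - sigma * dt) / (2 * eps + sigma * dt)
    cb = 2 * dt / (dx * (2 * eps + sigma * dt))
    da = (2 * mu - rho_m * dt) / (2 * mu + rho_m * dt)
    db = 2 * dt / (dx * (2 * mu + rho_m * dt))
    return ca, cb, da, db

def run_tmz(n=41, steps=15, dx=1e-3, dt=2e-12, freq=900e6):
    """Advance Eqs. (5.6a)-(5.6c) on an n x n free-space grid (assumed setup)."""
    ca, cb, da, db = tm_coefficients(EPS0, MU0, 0.0, 0.0, dt, dx)
    ez = [[0.0] * n for _ in range(n)]
    hx = [[0.0] * n for _ in range(n)]   # hx[i][j] holds Hx at (i, j - 1/2)
    hy = [[0.0] * n for _ in range(n)]   # hy[i][j] holds Hy at (i - 1/2, j)
    c = n // 2
    for step in range(1, steps + 1):
        for i in range(n):
            for j in range(n - 1):       # Eq. (5.6a)
                hx[i][j + 1] = da * hx[i][j + 1] + db * (ez[i][j] - ez[i][j + 1])
        for i in range(n - 1):
            for j in range(n):           # Eq. (5.6b)
                hy[i + 1][j] = da * hy[i + 1][j] + db * (ez[i + 1][j] - ez[i][j])
        for i in range(1, n - 1):
            for j in range(1, n - 1):    # Eq. (5.6c); the outer edge stays at zero
                ez[i][j] = ca * ez[i][j] + cb * (
                    hy[i + 1][j] - hy[i][j] + hx[i][j] - hx[i][j + 1])
        ez[c][c] = math.sin(2 * math.pi * freq * step * dt)   # hard source
    return ez
```

In this sketch the field can advance by at most one cell per time step, so cells farther from the source than the number of steps remain exactly zero. The pattern around the source is symmetric, which is a quick sanity check on the index conventions.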

5.2.1 Perfectly Matched Layer ABC

The perfectly matched layer (PML) ABC is an absorbing-material boundary condition first proposed by J. P. Berenger. The PML has proven very effective: it is reflectionless for impinging waves of any polarization and angle of incidence, and over a broad frequency band. In Berenger's PML technique, the computational area is surrounded by PML layers. The EM energy is absorbed rapidly in these layers, so that a perfect conductor can be placed at the outermost boundary. Equivalently, the interior area can be understood as being matched to the desired properties by the PML (Fig. 5.3). For a TMz wave, Ez is split into Ezx and Ezy, and Faraday's law and Ampere's law break into four equations:

$$\varepsilon\frac{\partial E_{zx}}{\partial t} + \sigma_x E_{zx} = \frac{\partial H_y}{\partial x} \tag{5.7a}$$

$$\varepsilon\frac{\partial E_{zy}}{\partial t} + \sigma_y E_{zy} = -\frac{\partial H_x}{\partial y} \tag{5.7b}$$

$$\mu\frac{\partial H_x}{\partial t} + \sigma_y^* H_x = -\frac{\partial (E_{zx} + E_{zy})}{\partial y} \tag{5.7c}$$

$$\mu\frac{\partial H_y}{\partial t} + \sigma_x^* H_y = \frac{\partial (E_{zx} + E_{zy})}{\partial x} \tag{5.7d}$$

where $\sigma^*$ is the equivalent magnetic conductivity.

Fig. 5.3 The PML technique (a vacuum region containing the wave source, surrounded by PML layers with parameters (σx, σ*x, σy, σ*y) and enclosed by a perfect conductor)

In the PML area, the finite-difference equations are [18]:

$$H_x^{n+1}(i+\tfrac12, j) = e^{-\sigma_y^*(i+1/2,\,j)\,\delta t/\mu}\, H_x^{n}(i+\tfrac12, j) - \frac{1 - e^{-\sigma_y^*(i+1/2,\,j)\,\delta t/\mu}}{\sigma_y^*(i+\tfrac12, j)\,\delta}\Big[E_{zx}^{n+1/2}(i+\tfrac12, j+\tfrac12) + E_{zy}^{n+1/2}(i+\tfrac12, j+\tfrac12) - E_{zx}^{n+1/2}(i+\tfrac12, j-\tfrac12) - E_{zy}^{n+1/2}(i+\tfrac12, j-\tfrac12)\Big] \tag{5.8a}$$

$$H_y^{n+1}(i, j+\tfrac12) = e^{-\sigma_x^*(i,\,j+1/2)\,\delta t/\mu}\, H_y^{n}(i, j+\tfrac12) - \frac{1 - e^{-\sigma_x^*(i,\,j+1/2)\,\delta t/\mu}}{\sigma_x^*(i, j+\tfrac12)\,\delta}\Big[E_{zx}^{n+1/2}(i-\tfrac12, j+\tfrac12) + E_{zy}^{n+1/2}(i-\tfrac12, j+\tfrac12) - E_{zx}^{n+1/2}(i+\tfrac12, j+\tfrac12) - E_{zy}^{n+1/2}(i+\tfrac12, j+\tfrac12)\Big] \tag{5.8b}$$

$$E_{zx}^{n+1/2}(i+\tfrac12, j+\tfrac12) = e^{-\sigma_x(i+1/2,\,j+1/2)\,\delta t/\varepsilon}\, E_{zx}^{n-1/2}(i+\tfrac12, j+\tfrac12) - \frac{1 - e^{-\sigma_x(i+1/2,\,j+1/2)\,\delta t/\varepsilon}}{\sigma_x(i+\tfrac12, j+\tfrac12)\,\delta}\Big[H_y^{n}(i, j+\tfrac12) - H_y^{n}(i+1, j+\tfrac12)\Big] \tag{5.8c}$$

$$E_{zy}^{n+1/2}(i+\tfrac12, j+\tfrac12) = e^{-\sigma_y(i+1/2,\,j+1/2)\,\delta t/\varepsilon}\, E_{zy}^{n-1/2}(i+\tfrac12, j+\tfrac12) - \frac{1 - e^{-\sigma_y(i+1/2,\,j+1/2)\,\delta t/\varepsilon}}{\sigma_y(i+\tfrac12, j+\tfrac12)\,\delta}\Big[H_x^{n}(i+\tfrac12, j+1) - H_x^{n}(i+\tfrac12, j)\Big] \tag{5.8d}$$

Here $\delta$ and $\delta t$ are the spatial and temporal increments. In the PML, the magnetic and electric conductivities are matched so that there is no reflection between layers. The wave impedance matching condition is [19]

$$\frac{\sigma}{\varepsilon} = \frac{\sigma^*}{\mu} \tag{5.9}$$
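As a small numerical illustration of the matching condition (5.9), the sketch below computes the magnetic conductivity that matches a given electric conductivity, and the two exponential decay factors that appear in Eqs. (5.8a)–(5.8d). It assumes free-space permittivity and permeability inside the PML; the conductivity and time step in the usage note are hypothetical values, and the function names are illustrative.

```python
import math

EPS0 = 8.854187817e-12      # permittivity of free space (F/m)
MU0 = 4.0e-7 * math.pi      # permeability of free space (H/m)

def matched_magnetic_conductivity(sigma, eps=EPS0, mu=MU0):
    """Solve Eq. (5.9), sigma / eps = sigma_star / mu, for sigma_star."""
    return sigma * mu / eps

def pml_decay_factors(sigma, dt, eps=EPS0, mu=MU0):
    """Exponential factors multiplying the stored E and H values in Eq. (5.8)."""
    sigma_star = matched_magnetic_conductivity(sigma, eps, mu)
    e_factor = math.exp(-sigma * dt / eps)
    h_factor = math.exp(-sigma_star * dt / mu)
    return e_factor, h_factor
```

For example, `pml_decay_factors(0.05, 2e-12)` returns two equal factors: when (5.9) holds, the split electric and magnetic fields decay at the same rate per time step, which is what keeps the layer matched.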

5.2.2 Mur's Absorbing Boundary Conditions

Spurious wave reflections occur at the boundaries of the computational domain because the FDTD grid must be truncated. Virtual absorbing boundaries must be used to prevent these reflections. Many absorbing boundary conditions (ABCs) have been developed over the past several decades, and Mur's ABC is one of the most common. There are two types of Mur's ABC for estimating the fields on the boundary, first-order and second-order accurate. Mur's ABCs provide better absorption with fewer cells required between the object and the outer boundary, but at the expense of added complexity. Nevertheless, Mur's absorbing boundaries are adequate and relatively simple to apply [13]. The FDTD simulations here were carried out in two dimensions with first-order Mur absorbing boundary conditions, so they did not require a supercomputer. Considering the Ez component located at $x = i\Delta x$, $y = j\Delta y$ for the 2D case, the first-order Mur estimate of the Ez field component on the boundary is [17, 20]

$$E_{i,j}^{n+1} = E_{i-1,j}^{n} + \frac{c\,\Delta t - \Delta x}{c\,\Delta t + \Delta x}\left(E_{i-1,j}^{n+1} - E_{i,j}^{n}\right) \tag{5.10}$$
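The effect of Eq. (5.10) is easiest to see in one dimension. The sketch below is a simplified 1D analogue, not the 2D program of this chapter. It uses normalized units in which only the Courant number cΔt/Δx appears, starts from a Gaussian Ez pulse in the middle of the grid, and applies Eq. (5.10) at both ends. The grid size, pulse width and Courant number are assumptions chosen for illustration.

```python
import math

def run_1d_with_mur(n_cells=200, n_steps=600, courant=0.5):
    """1D FDTD in normalized units, with Eq. (5.10) applied at both ends."""
    centre = n_cells / 2
    ez = [math.exp(-0.5 * ((k - centre) / 10.0) ** 2) for k in range(n_cells)]
    hy = [0.0] * (n_cells - 1)          # hy[k] sits between ez[k] and ez[k + 1]
    coef = (courant - 1.0) / (courant + 1.0)   # (c*dt - dx) / (c*dt + dx)
    for _ in range(n_steps):
        for k in range(n_cells - 1):
            hy[k] += courant * (ez[k + 1] - ez[k])
        # keep the old values next to each boundary, as Eq. (5.10) needs them
        old_first, old_second = ez[0], ez[1]
        old_last, old_before_last = ez[-1], ez[-2]
        for k in range(1, n_cells - 1):
            ez[k] += courant * (hy[k] - hy[k - 1])
        ez[0] = old_second + coef * (ez[1] - old_first)
        ez[-1] = old_before_last + coef * (ez[-2] - old_last)
    return ez
```

The initial pulse splits into two halves travelling in opposite directions. After 100 steps both halves are still inside the grid; after 600 steps they have passed through the boundaries and only a small numerical residue should remain.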

5.3 Developed Program

A program was developed in the Matlab programming language to examine the propagation of mobile phone radiation [21, 22]. The area analyzed in the program is represented in Fig. 5.4, and the flow chart of the program is shown in Fig. 5.5. As shown in Fig. 5.5, the user first enters the required input parameters; the area of analysis is then divided into cells, and matrices are created for the electric and magnetic field components (E, H) calculated at each time step and each cell. Then the mathematical function of the electric field emitted by the mobile phone antenna is entered. Mur's absorbing boundary conditions are applied to eliminate artificial reflections, and loops calculate the electric and magnetic field values by stepping in position and time, in the part that can be called the FDTD cycle.

Fig. 5.4 Representation of the 2D simulation area (axis labels: 0, x0, x1, x2, xT, xH, xend and y1, y0, yT, y2, yend)

Fig. 5.5 Flow diagram for developed program


Fig. 5.6 Graphical interface of the developed program

The maximum electric field value is recorded at the test point (T) for 1 g or 10 g SAR. SAR values are calculated for each cell using Eq. 5.11 [23], and the 1 g or 10 g averaged SAR is then obtained by averaging them.

$$\mathrm{SAR} = \frac{\sigma\,|E_T|^2}{2\rho} \quad (\mathrm{W/kg}) \tag{5.11}$$

Here, $\sigma$ is the average conductivity of the head, $\rho$ is the average mass density, and $E_T$ is the maximum electric field calculated for the test point. A graphical interface was designed for the developed program; it is shown in Fig. 5.6. All required data are entered there, and the program is then executed with the START button.
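Eq. 5.11 and the averaging step can be written in a few lines. The sketch below is illustrative only: the function names are not taken from the developed program, the default conductivity and density are the head values quoted in Sect. 5.3.1, and the field amplitudes in the usage note are hypothetical.

```python
def local_sar(e_peak, sigma, rho):
    """Eq. (5.11): SAR in W/kg from the peak electric field in V/m."""
    return sigma * abs(e_peak) ** 2 / (2.0 * rho)

def averaged_sar(e_peaks, sigma=0.97, rho=1000.0):
    """Average the per-cell SAR over the cells that make up 1 g or 10 g."""
    values = [local_sar(e, sigma, rho) for e in e_peaks]
    return sum(values) / len(values)
```

For instance, a single cell with a peak field of 10 V/m gives 0.97 × 100 / 2000 = 0.0485 W/kg, and averaging two cells with 10 V/m and 20 V/m gives (0.0485 + 0.194) / 2 = 0.12125 W/kg.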

5.3.1 Input Parameters

Simulations were performed separately for the unshielded case, by entering the electrical properties of free space in the Shield Features part of the program's graphical user interface, and for the shielded case, by entering the electrical properties of copper ($\sigma = 5.8 \times 10^{7}$ S/m, 30 × 2 mm in size). SAR was calculated over 8 cells in the vicinity of the test point for 1 g SAR and over 80 cells for 10 g SAR.

Table 5.1 Obtained SAR values from simulation results and shielding effectiveness values

Averaging mass   SAR (W/kg) without shield   SAR (W/kg) with copper shield   SE (dB)
1 g              0.7079                      0.0061                          -41.3
10 g             0.5958                      0.0060                          -39.9

The output power of the radiation source was assumed constant during the simulations. For the head, in which the SAR values were calculated, the average electrical conductivity was assumed to be 0.97 S/m, the average mass density 1000 kg/m³, the relative permittivity 41.5, and the diameter 180 mm at 900 MHz [24, 25]. The time and space increments of the FDTD simulations were selected as 2 ps and 1 mm, respectively.
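The chosen increments can be checked against the guidelines of Sect. 5.2. The sketch below evaluates the equal-cell Courant limit (5.1a) for a 2D grid and the number of cells per wavelength in the head material. The wavelength is estimated from the relative permittivity alone, neglecting the conductivity; this simplification is made here for illustration and is not stated in the chapter. Function names are illustrative.

```python
import math

C0 = 299_792_458.0   # speed of light in free space (m/s)

def courant_limit(dx, dims, v=C0):
    """Largest stable time step from Eq. (5.1a) for equal cell sides dx."""
    return dx / (v * math.sqrt(dims))

def cells_per_wavelength(freq_hz, eps_r, dx):
    """Cells per wavelength in a dielectric, ignoring conductivity (assumption)."""
    wavelength = C0 / (freq_hz * math.sqrt(eps_r))
    return wavelength / dx
```

With a 1 mm cell in 2D, `courant_limit(1e-3, 2)` is about 2.36 ps, so the 2 ps time step satisfies the stability condition. At 900 MHz with a relative permittivity of 41.5, the wavelength in the head is about 52 mm, so a 1 mm cell gives roughly 52 cells per wavelength, well beyond the λ/10 guideline.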

5.3.2 Simulation Results

The 1 g and 10 g averaged SAR values were calculated for both cases, as given in Table 5.1. The shielding effectiveness (SE) was calculated from the simulated values with Eq. 5.12.

$$SE = 20\log_{10}\frac{S_1}{S_2} \quad (\mathrm{dB}) \tag{5.12}$$

Here, $S_1$ is the SAR value in the shielded case and $S_2$ is the SAR value in the unshielded case. As shown in Table 5.1, under the effect of the copper shield the SAR value decreased from 0.7079 to 0.0061 W/kg for the 1 g averaged case and from 0.5958 to 0.0060 W/kg for the 10 g averaged case.
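The SE column of Table 5.1 can be reproduced from the two SAR columns with Eq. 5.12. The short sketch below does this; the function name is illustrative, and the base-10 logarithm is assumed because it reproduces the tabulated values.

```python
import math

def shielding_effectiveness_db(sar_shielded, sar_unshielded):
    """Eq. (5.12): SE in dB from the shielded and unshielded SAR values."""
    return 20.0 * math.log10(sar_shielded / sar_unshielded)

se_1g = shielding_effectiveness_db(0.0061, 0.7079)    # 1 g row of Table 5.1
se_10g = shielding_effectiveness_db(0.0060, 0.5958)   # 10 g row of Table 5.1
```

Rounded to one decimal place, `se_1g` and `se_10g` agree with the -41.3 dB and -39.9 dB entries of Table 5.1.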

5.4 Conclusion

This chapter has described mobile phones and their possible health risks, the SAR parameter, its calculation and experimental measurement, and a numerical computing technique (the 2D-FDTD method). In the application given in this chapter, the reduction of radiation from a mobile phone towards its user by a copper shield at 900 MHz was investigated by calculating SAR values in simulations. The 2D formulation and Mur's boundary condition were chosen to keep the computer memory and processor requirements to a minimum. The 1 g and 10 g averaged SAR values were computed separately, and the shielding effectiveness was calculated from the estimated SAR values for the shielded and unshielded conditions. The simulations showed that the copper shield reduced the SAR values affecting the mobile phone user by about 40 dB.

Acknowledgments This work was supported by the scientific research projects (BAP) coordinating office of Selçuk University.


References
1. Health Protection Agency [Online] Available: http://www.hpa.org.uk/Topics/Radiation/UnderstandingRadiation/UnderstandingRadiationTopics/ElectromagneticFields/MobilePhones/info_HealthAdvice/
2. Australian Radiation Protection and Nuclear Safety Agency [Online] Available: http://www.arpansa.gov.au/mobilephones/index.cfm
3. World Health Organization [Online] Available: http://www.who.int/mediacentre/factsheets/fs193/en/index.html
4. Ahlbom A, Bergqvist U, Bernhardt JH, Césarini JP, Court LA (1998) Guidelines for limiting exposure to time-varying electric, magnetic, and electromagnetic fields (up to 300 GHz). Health Phys Soc 74(4):494–522
5. Kuo L-C, Chuang H-R (2003) FDTD computation of fat layer effects on the SAR distribution in a multilayered superquadric-ellipsoidal head-model irradiated by a dipole antenna at 900/1800 MHz. In: IEEE International Symposium on Electromagnetic Compatibility
6. Kuo L-C, Lin C-C, Chuang H-R (2004) FDTD computation of fat layer effects on SAR distribution in a multilayered superquadric-ellipsoidal head model and MRI-based heads proximate to a dipole antenna. Radio Science Conference, Proceedings. Asia-Pacific, August 2004
7. Chen H-Y, Wang H-H (1994) Current and SAR induced in a human head model by the electromagnetic fields irradiated from a cellular phone. IEEE Trans Microwave Theory Tech 42(12):2249–2254
8. Schiavoni A, Bertotto P, Richiardi G, Bielli P (2000) SAR generated by commercial cellular phones-phone modeling, head modeling, and measurements. IEEE Trans Microwave Theory Tech 48(11):2064–2071
9. Kusuma AH, Sheta A-F, Elshafiey I, Alkanhal M, Aldosari S, Alshebeili SA (2010) Low SAR antenna design for modern wireless mobile terminals. In: STS International Conference, January 2010
10. Islam MT, Faruque MRI, Misran N (2009) Reduction of specific absorption rate (SAR) in the human head with ferrite material and metamaterial. Prog Electromagn Res C 9:47–58
11. Yee K (1966) Numerical solution of initial boundary value problems involving Maxwell's equations in isotropic media. IEEE Trans Antennas Propag 14:302–307
12. Taflove A, Brodwin ME (1975) Numerical solution of steady-state electromagnetic scattering problems using the time-dependent Maxwell's equations. IEEE Trans Microwave Theory Tech 23:623–630
13. Kunz KS, Luebbers RJ (1993) The finite difference time domain method for electromagnetism. CRC Press, Boca Raton
14. Stutzman WL, Thiele GA (1998) Antenna theory. Wiley, New York
15. Isaacson E, Keller HB (1967) Analysis of numerical methods. Wiley, New York
16. Davidson DB (2005) Computational electromagnetics for RF and microwave engineering. Cambridge University Press, Cambridge
17. Seyfi L, Yaldiz E (2006) Shielding analysis of mobile phone radiation with good conductors. In: Proceedings of the International Conference on Modeling and Simulation, vol 1, pp 189–194
18. Sadiku MNO (2000) Numerical techniques in electromagnetic, 2nd edn. CRC Press, Boca Raton
19. Berenger JP (1994) Matched layer for the absorption of electromagnetic waves. J Comp Phys 114:185–200
20. Mur G (1981) Absorbing boundary conditions for the finite-difference approximation of the time-domain electromagnetic field equations. IEEE Trans Electromag Compat 23:377–382
21. Seyfi L, Yaldız E (2010) Numerical computing of reduction of SAR values in a homogenous head model using copper shield. Lecture notes in engineering and computer science: Proceedings of the World Congress on Engineering 2010, WCE 2010, 30 June–2 July 2010, London, UK, pp 839–843
22. Seyfi L, Yaldız E (2008) Simulation of reductions in radiation from cellular phones towards their users. In: First International Conference on Simulation Tools and Techniques for Communications. Networks and Systems, Marseille, France, March 2008
23. Foster KR, Chang K (eds) (2005) Encyclopedia of RF and microwave engineering, vol 1. Wiley-Interscience, Hoboken
24. Moustafa J, Abd-Alhameed RA, Vaul JA, Excell PS, McEwen NJ (2001) Investigations of reduced SAR personal communications handset using FDTD. In: Eleventh International Conference on Antennas and Propagation (IEE Conf Publ No 480), 17–20 April, pp 11–15
25. FCC, OET Bulletin 65c (2001) Evaluating compliance with FCC Guidelines for human exposure to radio frequency electromagnetic fields, [Online] Available: http://www.fcc.gov/Bureaus/Engineering_Technology/Documents/bulletins/oet65/oet65c.pdf

Chapter 6

Mitigation of Magnetic Field Under Overhead Transmission Line

Adel Zein El Dein Mohammed Moussa

Abstract This chapter presents an efficient way to mitigate the magnetic field produced by the three-phase 500 kV single-circuit overhead transmission line existing in Egypt, by using a passive loop conductor. The aim is to reduce the amount of land required as right-of-way. The chapter uses an accurate method for evaluating the 50 Hz magnetic field produced by overhead transmission lines, based on the matrix formalism of multiconductor transmission lines (MTL). This method yields a correct evaluation of all the currents flowing in the MTL structure, including the currents in the subconductors of each phase bundle, the currents in the ground wires, the currents in the mitigation loop, and the earth return currents. Furthermore, the analysis incorporates the effect of the conductors' sag between towers, and the effect of the variation of sag with temperature on the calculated magnetic field. Good results have been obtained, and passive loop conductor design parameters are recommended for this system at ambient temperature (35°C).

6.1 Introduction

The rapid increase in HV transmission lines and in irregular populated areas near man-made sources of electric and magnetic fields in Egypt calls for methods to minimize or eliminate the effect of magnetic and electric fields on human beings in the Egyptian environment, especially in irregular areas.

A. Z. El Dein Mohammed Moussa (&) Department of Electrical Engineering, High Institute of Energy, South Valley University, Aswan, 81258, Egypt e-mail: [email protected]

S. I. Ao and L. Gelman (eds.), Electrical Engineering and Applied Computing, Lecture Notes in Electrical Engineering, 90, DOI: 10.1007/978-94-007-1192-1_6, Ó Springer Science+Business Media B.V. 2011


Public concern about magnetic field effects on human safety has triggered a wealth of research efforts focused on the evaluation of magnetic fields produced by power lines [1–4]. Studies include the design of new compact transmission line configurations; the inclusion of auxiliary single or double loops for magnetic field mitigation in existing power lines; the consideration of series-capacitor compensation schemes for enhancing magnetic field mitigation; the reconfiguration of lines to high phase operation, etc. [5–7]. However, many of the published studies of power lines make use of simplifying assumptions that inevitably give rise to inaccurate results in the computed magnetic fields. Common simplifications include neglecting the earth currents, neglecting the ground wires, replacing bundled phase conductors with equivalent single conductors, and replacing actual sagged conductors with horizontal conductors at average height. These assumptions result in a model whose magnetic fields are distorted from those produced in reality [8, 9]. In this chapter, a matrix-based MTL model [10] is used, in which the effects of earth currents, ground wire currents and the mitigation loop current are taken into account; moreover, actual bundle conductors and the conductors' sag at various temperatures are considered. The results of this method without a mitigation loop are compared with those of the common practice method [8, 9] for magnetic field calculation, in which the power transmission lines are straight horizontal wires of infinite length, parallel to a flat ground and parallel to each other. The optimal design parameters of the mitigation loop for the system under study are then obtained.

6.2 Computation of System Currents

The MTL technique is used in this chapter for the simple purpose of deriving the relationship among the line currents of an overhead power line. The method is explained in [10]; this chapter reviews it and extends it to the Egyptian 500 kV overhead transmission line, using another formula for the conductors' sag that takes into account the effect of temperature on the sag configuration [11]. The first step of a correct analysis is the determination of all system currents from the prescribed phase-conductor currents $I_p$:

$$I_p = [I_1, I_2, I_3] \tag{6.1}$$

Consider the frequency-domain transmission line matrix equations for nonuniform MTLs (which allow the inclusion of the sag effect):

$$-\frac{dV}{dz} = Z'(\omega, z)\, I \tag{6.2a}$$

$$-\frac{dI}{dz} = Y'(\omega, z)\, V \tag{6.2b}$$


where $Z'$ and $Y'$ denote the per-unit-length series-impedance and shunt-admittance matrices, respectively, and V and I are complex column matrices collecting the phasors associated with all of the voltages and currents of the line conductors, respectively:

$$V = \begin{bmatrix} [V_a]_{1\times n_p} \\ [V_b]_{1\times n_p} \\ [V_c]_{1\times n_p} \\ [V_G]_{1\times n_G} \\ [V_L]_{1\times n_L} \end{bmatrix} \quad \text{and} \quad I = \begin{bmatrix} [I_a]_{1\times n_p} \\ [I_b]_{1\times n_p} \\ [I_c]_{1\times n_p} \\ [I_G]_{1\times n_G} \\ [I_L]_{1\times n_L} \end{bmatrix} \tag{6.3}$$

In (6.3), subscripts a, b, and c refer to the partition of the phase bundles into three sub-conductor sets. Subscript G refers to the ground wires and subscript L refers to the mitigation loop. In (6.3), $n_p$, $n_G$, and $n_L$ denote the number of phase bundles, the number of ground wires, and the number of conductors in the mitigation loop, respectively. For the Egyptian 500 kV overhead transmission line, $n_p = 3$, $n_G = 2$, and $n_L = 2$, as proposed in this chapter. Since the separation of the electric and magnetic effects is an adequate approach for quasistationary regimes (50 Hz), where wave-propagation phenomena are negligible, all system currents are assumed to be independent of z. This means that the transversal displacement currents among conductors are negligible; in other words, (6.2b) equates to zero and only the $Z'$ values need to be calculated. Since the standard procedure for computing $Z'$ in (6.2a) is established elsewhere [12–14], only a brief summary is presented here.

$$Z' = j\omega L + Z_E + Z_{skin} \tag{6.4}$$

The external-inductance matrix L is a frequency-independent real symmetric matrix whose entries are:

$$L_{kk} = \frac{\mu_0}{2\pi}\ln\frac{2y_k}{r_k} \tag{6.5a}$$

$$L_{ik} = \frac{\mu_0}{4\pi}\ln\frac{(y_i + y_k)^2 + (x_i - x_k)^2}{(y_i - y_k)^2 + (x_i - x_k)^2} \tag{6.5b}$$

where $r_k$ denotes the conductor radius, and $y_k$ and $x_k$ denote the vertical and horizontal coordinates of conductor k. Matrix $Z_E$, the earth impedance correction, is a frequency-dependent complex matrix whose entries can be determined using Carson's theory or, alternatively, the Dubanton complex ground plane approach [12–14]. The entries of $Z_E$ are defined as:

$$(Z_E)_{kk} = j\omega\frac{\mu_0}{2\pi}\ln\left(1 + \frac{\bar{P}}{y_k}\right) \tag{6.6a}$$


Fig. 6.1 Linear dimensions which determine parameters of the catenary

2 þ ðxi xk Þ2 l ðyi þ yk þ 2PÞ ðZE Þik ¼ jx o ln 2 4p ðyi yk Þ þ ðxi xk Þ2

! ð6:6bÞ

where P, the complex depth, is given by $P = (j\omega\mu_0/\rho)^{-1/2}$, with $\rho$ denoting the earth resistivity. Matrix $Z_{skin}$ is a frequency-dependent complex diagonal matrix whose entries can be determined by using the skin-effect theory results for cylindrical conductors [9]. For low-frequency situations, it will be:

$$(Z_{skin})_{kk} = (R_{dc})_k + j\omega \frac{\mu_0}{8\pi} \tag{6.7}$$

where $(R_{dc})_k$ denotes the per-unit-length dc resistance of conductor k. Because the line conductors sag between towers, $y_k$ is a function of the position z along the span, and the entries of L and $Z_E$ defined in (6.5a, b) and (6.6a, b) therefore vary along the longitudinal coordinate z. The exact shape of a conductor suspended between two towers of equal height can be described by parameters such as the distance between the points of suspension (the span), d, the sag of the conductor, S, the height of the lowest point above the ground, h, and the height of the highest point above the ground, $h_m$. These parameters can be used in different combinations [13, 14]. Figure 6.1 depicts the basic catenary geometry for a single-conductor line. This geometry is described by:

$$y_k = h_k + 2 a_k \sinh^2\left(\frac{z}{2 a_k}\right) \tag{6.8}$$

where $a_k$ is the solution of the transcendental equation $2[(h_{mk} - h_k)/d_k]\,u_k = \sinh^2(u_k)$ for conductor k, with $u_k = d_k/(4 a_k)$. The parameter $a_k$ is also associated with the mechanical parameters of the line: $a_k = (T_h)_k / w_k$, where $(T_h)_k$ is the conductor tension at mid-span and $w_k$ is the weight per unit length of conductor k. Consider a mitigation loop of length l, where l is a multiple of the span length d. The line section under analysis has its near end at z = -l/2 and its far end at z = l/2. The integration of (6.2a) from z = -l/2 to z = l/2 gives:

6 Mitigation of Magnetic Field Under Overhead Transmission Line

$$V_{near} - V_{far} = \left[\int_{-l/2}^{l/2} Z'(z)\,dz\right] I \tag{6.9a}$$

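Equation 6.9a can be written explicitly, in partitioned form, as:

$$\begin{bmatrix} \Delta V_a \\ \Delta V_b \\ \Delta V_c \\ \Delta V_G \\ \Delta V_L \end{bmatrix} = \begin{bmatrix} Z_{aa} & Z_{ab} & Z_{ac} & Z_{aG} & Z_{aL} \\ Z_{ba} & Z_{bb} & Z_{bc} & Z_{bG} & Z_{bL} \\ Z_{ca} & Z_{cb} & Z_{cc} & Z_{cG} & Z_{cL} \\ Z_{Ga} & Z_{Gb} & Z_{Gc} & Z_{GG} & Z_{GL} \\ Z_{La} & Z_{Lb} & Z_{Lc} & Z_{LG} & Z_{LL} \end{bmatrix} \begin{bmatrix} I_a \\ I_b \\ I_c \\ I_G \\ I_L \end{bmatrix} \tag{6.9b}$$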
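The way the catenary (6.8) enters the span integral in (6.9a) can be illustrated numerically. The following is a minimal sketch, not the author's implementation: it solves the transcendental equation for $a_k$ by bisection and integrates the self-inductance entry (6.5a) over one span with the midpoint rule. The span length of 400 m and the lowest-point height of 12 m are hypothetical values chosen for illustration; the sag (9.3, from Table 6.1 at 35 °C, taken to be in metres) and the radius (15.3 mm, from Table 6.2) come from the chapter. The function names are likewise assumptions of this example.

```python
import math

MU0 = 4.0e-7 * math.pi  # permeability of free space (H/m)


def catenary_parameter(sag, span, tol=1e-12):
    """Solve 2*(sag/span)*u = sinh(u)**2 for u > 0 and return a = span/(4*u)."""
    target = 2.0 * sag / span

    def g(u):
        # sinh(u)**2 / u increases monotonically for u > 0
        return math.sinh(u) ** 2 / u - target

    lo, hi = 1e-9, 1.0
    while g(hi) < 0.0:          # widen the bracket until the root is enclosed
        hi *= 2.0
    while hi - lo > tol:        # plain bisection
        mid = 0.5 * (lo + hi)
        if g(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return span / (4.0 * 0.5 * (lo + hi))


def height(z, h_low, a):
    """Conductor height from (6.8); z is measured from mid-span."""
    return h_low + 2.0 * a * math.sinh(z / (2.0 * a)) ** 2


def span_self_inductance(h_low, a, radius, span, steps=1000):
    """Midpoint-rule integral of L_kk(z) from (6.5a) over one span, in henry."""
    dz = span / steps
    total = 0.0
    for i in range(steps):
        z = -span / 2.0 + (i + 0.5) * dz
        total += MU0 / (2.0 * math.pi) * math.log(2.0 * height(z, h_low, a) / radius) * dz
    return total


SPAN = 400.0    # hypothetical span length (m)
H_LOW = 12.0    # hypothetical height of the lowest point (m)
a_k = catenary_parameter(9.3, SPAN)
print(a_k, height(SPAN / 2.0, H_LOW, a_k), span_self_inductance(H_LOW, a_k, 0.0153, SPAN))
```

The same loop could accumulate the complex entries of $Z_E$ and $Z_{skin}$; only the real inductance term is shown to keep the sketch short.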

The computation of the bus impedance Z in Eq. 6.9a, b is performed using the following formula:

$$Z = \int_{-l/2}^{l/2} Z'(z)\,dz \tag{6.10}$$

where the values of $Z'$ are evaluated from Eqs. 6.4–6.7 considering the conductors' heights given by (6.8). The two-conductor mitigation loop is closed and may or may not include a series capacitor of impedance $Z_c$ [7]. In any case, the submatrix $I_L$ in (6.3) has the form:

$$I_L = \begin{bmatrix} I_{L1} \\ I_{L2} \end{bmatrix} = \bar{I}_L S^T \tag{6.11}$$

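where $S = [\,1 \;\; -1\,]$. By using the boundary conditions at both the near and far ends of the line section, the voltage drop in the mitigation loop will be:

$$\Delta V_L = \begin{bmatrix} \Delta V_{L1} \\ \Delta V_{L2} \end{bmatrix} = \begin{bmatrix} V_{L1} \\ V_{L2} \end{bmatrix}_{near} - \begin{bmatrix} V_{L1} \\ V_{L2} \end{bmatrix}_{far} = \begin{bmatrix} \Delta V_{L1} \\ \Delta V_{L1} + Z_c \bar{I}_L \end{bmatrix}$$

which can be written as:

$$S\,\Delta V_L = \Delta V_{L1} - \Delta V_{L2} = -Z_c \bar{I}_L \tag{6.12}$$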

where $\bar{I}_L$ is the loop current, and $Z_c = jX_s = 1/(j\omega C_s)$ is the impedance of the series capacitor included in the loop. Using (6.12), the fifth equation contained in (6.9b) allows for the evaluation of the currents flowing in the mitigation loop:

$$I_L = \begin{bmatrix} \bar{I}_L \\ -\bar{I}_L \end{bmatrix} = -\underbrace{Y S^T S Z_{La}}_{K_{La}} I_a - \underbrace{Y S^T S Z_{Lb}}_{K_{Lb}} I_b - \underbrace{Y S^T S Z_{Lc}}_{K_{Lc}} I_c - \underbrace{Y S^T S Z_{LG}}_{K_{LG}} I_G \tag{6.13}$$

where $Y = (Z_c + S Z_{LL} S^T)^{-1}$. Taking into account that the conductors belonging to a given phase bundle are bonded to each other, and that the ground wires are bonded to earth (tower resistances neglected), it follows that $\Delta V_a = \Delta V_b = \Delta V_c$ and $\Delta V_G = 0$. By using $\Delta V_G = 0$ in the fourth equation contained in (6.9b) and using Eq. 6.13, the ground-wire currents will be:

$$I_G = \underbrace{Y_G (Z_{Ga} - Z_{GL} K_{La})}_{K_{Ga}} I_a + \underbrace{Y_G (Z_{Gb} - Z_{GL} K_{Lb})}_{K_{Gb}} I_b + \underbrace{Y_G (Z_{Gc} - Z_{GL} K_{Lc})}_{K_{Gc}} I_c \tag{6.14}$$

where $Y_G = (Z_{GL} K_{LG} - Z_{GG})^{-1}$. Next, by using (6.13) and (6.14), $I_L$ and $I_G$ can be eliminated in (6.9b), yielding a reduced-order matrix problem:

$$\begin{bmatrix} \Delta V_a \\ \Delta V_b \\ \Delta V_c \end{bmatrix} = \begin{bmatrix} \hat{Z}_{aa} & \hat{Z}_{ab} & \hat{Z}_{ac} \\ \hat{Z}_{ba} & \hat{Z}_{bb} & \hat{Z}_{bc} \\ \hat{Z}_{ca} & \hat{Z}_{cb} & \hat{Z}_{cc} \end{bmatrix} \begin{bmatrix} I_a \\ I_b \\ I_c \end{bmatrix} \tag{6.15}$$

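where

$$\begin{aligned}
\hat{Z}_{aa} &= Z_{aa} + Z_{aG} K_{Ga} - Z_{aL}(K_{La} + K_{LG} K_{Ga}) \\
\hat{Z}_{ab} &= Z_{ab} + Z_{aG} K_{Gb} - Z_{aL}(K_{Lb} + K_{LG} K_{Gb}) \\
\hat{Z}_{ac} &= Z_{ac} + Z_{aG} K_{Gc} - Z_{aL}(K_{Lc} + K_{LG} K_{Gc}) \\
\hat{Z}_{ba} &= Z_{ba} + Z_{bG} K_{Ga} - Z_{bL}(K_{La} + K_{LG} K_{Ga}) \\
\hat{Z}_{bb} &= Z_{bb} + Z_{bG} K_{Gb} - Z_{bL}(K_{Lb} + K_{LG} K_{Gb}) \\
\hat{Z}_{bc} &= Z_{bc} + Z_{bG} K_{Gc} - Z_{bL}(K_{Lc} + K_{LG} K_{Gc}) \\
\hat{Z}_{ca} &= Z_{ca} + Z_{cG} K_{Ga} - Z_{cL}(K_{La} + K_{LG} K_{Ga}) \\
\hat{Z}_{cb} &= Z_{cb} + Z_{cG} K_{Gb} - Z_{cL}(K_{Lb} + K_{LG} K_{Gb}) \\
\hat{Z}_{cc} &= Z_{cc} + Z_{cG} K_{Gc} - Z_{cL}(K_{Lc} + K_{LG} K_{Gc})
\end{aligned}$$

The relationship between $I_a$, $I_b$ and $I_c$ is obtained from (6.15) by setting $\Delta V_a = \Delta V_b = \Delta V_c$ and by using $I_a + I_b + I_c = I_p$. The following relations are then obtained:

$$I_a = KK_{ac}\,(KK_{ac} + K_{bc} + 1)^{-1} I_p \tag{6.16a}$$

$$I_b = K_{bc}\,(KK_{ac} + K_{bc} + 1)^{-1} I_p \tag{6.16b}$$

$$I_c = (KK_{ac} + K_{bc} + 1)^{-1} I_p \tag{6.16c}$$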

Table 6.1 Temperature effect

Temperature (°C)   15    20    25    30    35    40    45
Sag (m)            7.3   7.8   8.3   8.8   9.3   9.8   10.3

where $KK_{ac} = K_{ab} K_{bc} + K_{ac}$; $K_{ab} = Y_a(\hat{Z}_{bb} - \hat{Z}_{ab})$; $K_{ac} = Y_a(\hat{Z}_{bc} - \hat{Z}_{ac})$; $Y_a = (\hat{Z}_{aa} - \hat{Z}_{ba})^{-1}$; $K_{bc} = (K_{bc1})^{-1} K_{cb1}$; $K_{bc1} = \hat{Z}_{ca} K_{ab} + \hat{Z}_{cb} - \hat{Z}_{ba} K_{ab} - \hat{Z}_{bb}$; and $K_{cb1} = \hat{Z}_{ba} K_{ac} + \hat{Z}_{bc} - \hat{Z}_{ca} K_{ac} - \hat{Z}_{cc}$. Once $I_p$ is given, all of the overhead conductor currents $I_a$, $I_b$, $I_c$, $I_G$ and $I_L$ can be evaluated step by step using (6.13), (6.14), and (6.16a–c). The net current returning through the earth, $I_E$, is the complement of the sum of all overhead conductor currents:

$$I_E = -\left[\sum_{k=1}^{n_p} (I_a)_k + \sum_{k=1}^{n_p} (I_b)_k + \sum_{k=1}^{n_p} (I_c)_k + \sum_{k=1}^{n_G} (I_G)_k + \sum_{k=1}^{n_L} (I_L)_k\right] \tag{6.17}$$

The sag of each conductor depends on the individual characteristics of the line and on the environmental conditions. By using the Overhead Cable Sag Calculation Program [15], the variation of the sag with temperature can be calculated, as in Table 6.1. Once all system currents are calculated, the magnetic field produced by these currents can be calculated at any point.
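Table 6.1 lists the sag only at 5 °C steps. As a minimal sketch, assuming that linear interpolation between the tabulated points is acceptable (an assumption of this example, not a statement about the sag program [15]), intermediate values could be estimated as follows. The function name `sag_at` is hypothetical, and the result is in the same unit as the tabulated sag.

```python
# Values copied from Table 6.1
TEMPERATURES_C = [15, 20, 25, 30, 35, 40, 45]
SAGS = [7.3, 7.8, 8.3, 8.8, 9.3, 9.8, 10.3]


def sag_at(temperature_c):
    """Linearly interpolate the tabulated sag; no extrapolation is attempted."""
    if not TEMPERATURES_C[0] <= temperature_c <= TEMPERATURES_C[-1]:
        raise ValueError("temperature outside the tabulated range")
    for i in range(len(TEMPERATURES_C) - 1):
        t0, t1 = TEMPERATURES_C[i], TEMPERATURES_C[i + 1]
        if t0 <= temperature_c <= t1:
            s0, s1 = SAGS[i], SAGS[i + 1]
            return s0 + (s1 - s0) * (temperature_c - t0) / (t1 - t0)


print(sag_at(22.5))
```

The tabulated values rise by 0.5 per 5 °C over the whole range, so the interpolation is equivalent to a single straight line within the table.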

6.3 Magnetic Field Calculations

By using the integration technique, which is explained in detail in [16] and reviewed here, the magnetic field produced at any point $P(x_o, y_o, z_o)$ by the M multiphase conductors in support structures, and by their images, can be obtained from the Biot–Savart law as [9, 16]:

$$H_o = \frac{1}{4\pi} \sum_{k=1}^{M} \sum_{n=-N}^{N} \int_{-d/2}^{d/2} \left[(H_x)_k \vec{a}_x + (H_y)_k \vec{a}_y + (H_z)_k \vec{a}_z\right] dz \tag{6.18}$$

$$(H_x)_k = \frac{I_k\left[(z - z_o + nd)\sinh\left(\frac{z}{a_k}\right) - (y_k - y_o)\right]}{d_k} - \frac{I_k\left[(z - z_o + nd)\sinh\left(\frac{z}{a_k}\right) - (y_k + y_o + 2P)\right]}{d'_k} \tag{6.19}$$

$$(H_y)_k = \frac{I_k (x_k - x_o)}{d_k} - \frac{I_k (x_k - x_o)}{d'_k} \tag{6.20}$$

$$(H_z)_k = -\frac{I_k (x_k - x_o)\sinh\left(\frac{z}{a_k}\right)}{d_k} + \frac{I_k (x_k - x_o)\sinh\left(\frac{z}{a_k}\right)}{d'_k} \tag{6.21}$$

$$d_k = \left[(x_k - x_o)^2 + (y_k - y_o)^2 + (z - z_o + nd)^2\right]^{3/2} \tag{6.22}$$

$$d'_k = \left[(x_k - x_o)^2 + (y_k + y_o + 2P)^2 + (z - z_o + nd)^2\right]^{3/2} \tag{6.23}$$

The parameter N in (6.18) represents the number of spans to the right and to the left of the generic one, for which n = 0, as shown in Fig. 6.1.
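The span-by-span integration in (6.18) can be sketched numerically. The example below is a simplified illustration under stated assumptions, not the chapter's program: it evaluates only the direct (non-image) terms for a single conductor with the midpoint rule, so the earth-return image terms with the complex depth P are deliberately omitted. The function names, the step count and all numerical values in the usage lines (span, heights, current, catenary parameter) are hypothetical.

```python
import math


def catenary_height(z, h_low, a):
    """Conductor height from (6.8) at local coordinate z within a span."""
    return h_low + 2.0 * a * math.sinh(z / (2.0 * a)) ** 2


def direct_field(current, x_k, h_low, a, span, n_spans, point, steps=2000):
    """Midpoint-rule integration of the direct terms of (6.18) for one conductor.

    Returns (Hx, Hy, Hz) in A/m. `current` may be a complex phasor.
    Image terms are left out of this sketch.
    """
    x_o, y_o, z_o = point
    dz = span / steps
    h_x = h_y = h_z = 0.0
    for n in range(-n_spans, n_spans + 1):       # spans left and right of n = 0
        for i in range(steps):
            z = -span / 2.0 + (i + 0.5) * dz
            slope = math.sinh(z / a)             # dy/dz of the catenary
            r_x = x_k - x_o
            r_y = catenary_height(z, h_low, a) - y_o
            r_z = z - z_o + n * span
            d_k = (r_x ** 2 + r_y ** 2 + r_z ** 2) ** 1.5   # as in (6.22)
            h_x += (r_z * slope - r_y) / d_k * dz
            h_y += r_x / d_k * dz
            h_z += -r_x * slope / d_k * dz
    scale = current / (4.0 * math.pi)
    return scale * h_x, scale * h_y, scale * h_z


if __name__ == "__main__":
    # Hypothetical geometry: 400 m span, lowest point at 12 m, a = 2152 m,
    # field point 1 m above ground under the conductor at mid-span.
    print(direct_field(2000.0, 0.0, 12.0, 2152.0, 400.0, 2, (0.0, 1.0, 0.0)))
```

A useful sanity check is the flat-conductor limit: with a very large catenary parameter and many spans, the magnitude of the result approaches the infinite straight wire value I/(2πR), where R is the distance from the wire to the field point.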

6.4 Results and Discussion

The data used in the calculation of the magnetic field intensity at points one meter above ground level (field points) under the Egyptian 500 kV single-circuit transmission line are presented in Table 6.2. The phase-conductor currents are defined by a balanced direct-sequence three-phase set of 50 Hz sinusoidal currents, with 2 kA rms, that is:

$$I_p = 2\,[\,1,\ \exp(-j2\pi/3),\ \exp(j2\pi/3)\,]\ \text{kA} \tag{6.24}$$
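As a small worked check of (6.24), the three phasors can be built with complex arithmetic. This is an illustrative sketch only; the variable names are assumptions of the example.

```python
import cmath
import math

I_RMS_KA = 2.0                              # rms magnitude from (6.24), in kA
ROT = cmath.exp(2j * math.pi / 3.0)         # 120 degree rotation operator

# Direct-sequence set: phase b lags phase a by 120 degrees, phase c leads by 120
phase_currents_ka = [I_RMS_KA, I_RMS_KA * ROT.conjugate(), I_RMS_KA * ROT]

residual = sum(phase_currents_ka)           # vanishes for a balanced set
print([abs(i) for i in phase_currents_ka], abs(residual))
```

Because the balanced phase currents sum to zero and the two loop currents in (6.13) are equal and opposite, (6.17) then leaves only the ground-wire currents to determine the earth return current.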

Table 6.2 Characteristics of 500 kV line conductors

Conductor   Radius (mm)   X-coordinate (m)   Y-coordinate (m)   Rdc at 20°C (Ω/km)
1a          15.3          -13.425            22.13              0.0511
1b          15.3          -12.975            22.13              0.0511
1c          15.3          -13.2              21.74              0.0511
2a          15.3          -0.225             24.48              0.0511
2b          15.3          0.225              24.48              0.0511
2c          15.3          0                  24.09              0.0511
3a          15.3          12.975             22.13              0.0511
3b          15.3          13.425             22.13              0.0511
3c          15.3          13.2               21.74              0.0511
G1          5.6           -8                 30                 0.564
G2          5.6           8                  30                 0.564
L1          11.2          -13.2              17                 0.1168
L2          11.2          13.2               17                 0.1168

Fig. 6.2 The effect of the spans' numbers on the magnetic field intensity (magnetic field intensity in A/m versus distance from the center phase in m; curves: at P1, single span N = 0; at P1, N = 1, 2, 3; at P2, single span N = 0; at P2, N = 1, 2, 3)

Fig. 6.3 The effect of the temperatures on the magnetic field intensity (magnetic field intensity in A/m versus distance from the center phase in m; curves: P1 at 45 deg.; P1 at 15 deg.; P2 at 45 deg.; P2 at 15 deg.)

Figure 6.2 shows the effect of the number of spans (N) on the calculated magnetic field intensity. When the magnetic field intensity is calculated at point P1 (Fig. 6.1) and at a distance from the center phase, the effect of the number of spans is very small because of the symmetry of the spans around the calculation points, as explained in Fig. 6.1: the contributions of the catenaries d1 and d2 are equal, and smaller than the contribution of the catenary d, as they are far from the field points. When the magnetic field intensity is calculated at point P2 (Fig. 6.1) and at a distance from the center phase, however, the effect of the number of spans is large (the field doubles). This is due to the contribution of the catenary d2, which in this case produces the same magnetic field intensity as the original span d, as explained in Fig. 6.1, while the catenary d1 makes only a small contribution to the calculated values. Figure 6.3 shows the effect of temperature on the configuration of the overhead transmission line conductors (sag) and hence on the magnetic field intensity calculated by using the 3D integration technique with the MTL technique. As the sag increases with increasing temperature (as indicated in Table 6.1), the magnetic field intensity also increases. Figure 6.4 compares the magnetic field calculated with the 2D straight-line technique, where the average conductor heights are used, and with the 3D integration technique combined with the MTL technique. The observed maximum error of -23.2959% (at point P1) and 49.877% (at point P2) is mainly due to neglecting the effect of the sag of the conductors. Figure 6.5 compares the magnetic field intensity calculated by using the 3D integration technique with the MTL technique, with and without ground wires and with and without the short-circuit mitigation loop. It is seen that the observed maximum reduction of 1.9316% (at point P1) and 2.469%

Fig. 6.4 Comparison between results of 2D and 3D techniques (magnetic field intensity in A/m versus distance from the center phase in m; curves: P1 at N = 2; 2D with average heights; P2 at N = 2)

Fig. 6.5 Comparison between the calculated magnetic field from the conductors only; the conductors and ground wires; and the conductors, ground wires and S. C. mitigation loop (magnetic field intensity in A/m versus distance from the center phase in m; curves for P1 and P2: Cond; Cond+Ground; Cond+Ground+Mit.SC)

Fig. 6.6 Effect of the reactance Xs, inserted in the mitigation loop, on the calculated magnetic field (magnetic field intensity in A/m versus Xs in Ohm, from -2 to 0; curves: P1; P2)

(at point P2) is mainly attributable to the ground wires. With the short-circuit mitigation loop placed 5 m beneath the outer phase conductors, the magnetic field intensity is reduced significantly, with a maximum reduction of 25.7063% (at point P1) and 30.1525% (at point P2). The magnetic field intensity can be reduced further by inserting an appropriately chosen series

Table 6.3 The effect of the mitigation loop heights on the calculated magnetic field intensity at point (P1) and 26.4 m mitigation loop spacing

Magnetic field (A/m) at P1 at the given distance from the center phase

Height of loop   Loop termination           -15 m    -10 m    0 m      10 m     15 m
18 m             Short circuit              15.03    17.77    20.83    19.74    16.83
                 With optimal capacitance   9.42     10.80    17.52    18.84    16.1
19 m             Short circuit              14.93    17.76    20.45    19.71    16.78
                 With optimal capacitance   8.88     10.82    17.12    18.77    15.98
20 m             Short circuit              14.64    17.52    19.94    19.49    16.57
                 With optimal capacitance   8.13     10.43    16.63    18.64    15.84
21 m             Short circuit              14.19    17.06    19.26    19.01    16.13
                 With optimal capacitance   7.01     9.56     15.87    18.15    15.40
23 m             Short circuit              14.10    17.01    19.00    19.31    16.43
                 With optimal capacitance   7.07     9.86     16.35    19.16    16.41
24 m             Short circuit              16.64    19.80    21.19    21.33    18.24
                 With optimal capacitance   11.46    14.13    17.97    19.79    16.96
25 m             Short circuit              18.03    21.33    22.46    22.53    19.31
                 With optimal capacitance   13.95    16.87    19.55    20.85    17.91
26 m             Short circuit              18.95    22.34    23.34    23.35    20.04
                 With optimal capacitance   15.70    18.764   20.82    21.80    18.74
27 m             Short circuit              19.61    23.08    24.00    23.96    20.58
                 With optimal capacitance   16.98    20.16    21.84    22.58    19.42

capacitor in the mitigation loop. In order to determine the optimal capacitance Cs of the capacitor to be inserted in the mitigation loop, the magnetic field intensity is calculated at a point one meter above the ground surface under the center phase for different values of Zc, where Zc = jXs, with the reactance Xs varying from -2 Ω to 0. Figure 6.6 shows the effect of the reactance Xs, inserted in the mitigation loop, on the magnetic field intensity. The optimal situation (minimum magnetic field intensity) is characterized by Cs = 4.897 mF, and the worst situation (maximum magnetic field intensity) is characterized by Cs = 2.358 mF. Tables 6.3 and 6.4 depict the effect of the mitigation loop height on the calculated magnetic field intensity at points P1 and P2, respectively, when the mitigation loop spacing is 26.4 m (exactly under the outer phases). It is seen that the optimal height is one meter below the outer phase

Table 6.4 The effect of the mitigation loop heights on the calculated magnetic field intensity at point (P2) and 26.4 m mitigation loop spacing

Magnetic field (A/m) at P2 at the given distance from the center phase

Height of loop   Loop termination           -15 m    -10 m    0 m      10 m     15 m
18 m             Short circuit              7.97     8.89     9.88     9.43     8.61
                 With optimal capacitance   5.49     6.28     7.77     7.99     7.44
19 m             Short circuit              7.77     8.7      9.68     9.26     8.44
                 With optimal capacitance   5.15     5.98     7.53     7.85     7.30
20 m             Short circuit              7.48     8.4      9.37     9.00     8.20
                 With optimal capacitance   4.67     5.54     7.21     7.64     7.12
21 m             Short circuit              7.09     7.99     8.93     8.61     7.84
                 With optimal capacitance   3.98     4.88     6.67     7.25     6.76
23 m             Short circuit              6.79     7.71     8.72     8.49     7.72
                 With optimal capacitance   3.91     4.93     6.98     7.72     7.22
24 m             Short circuit              8.06     9.09     10.06    9.65     8.74
                 With optimal capacitance   5.52     6.49     8.01     8.30     7.66
25 m             Short circuit              8.76     9.85     10.82    10.32    9.33
                 With optimal capacitance   6.67     7.68     8.99     9.01     8.25
26 m             Short circuit              9.23     10.37    11.34    10.78    9.74
                 With optimal capacitance   7.5      8.56     9.76     9.61     8.75
27 m             Short circuit              9.58     10.75    11.73    11.13    10.05
                 With optimal capacitance   8.13     9.22     10.37    10.09    9.17

conductors when the mitigation loop is short-circuited, and about one meter above the outer phase conductors when an optimal capacitance is inserted in the mitigation loop. Tables 6.5 and 6.6 depict the effect of the mitigation loop spacing on the calculated magnetic field intensity at points P1 and P2, respectively, when the mitigation loop height is 21 m. It is seen that the optimal spacing is the outer phase conductors' spacing. Figure 6.7 compares the calculated magnetic field intensity values resulting from the conductors, ground wires and short-circuit mitigation loop with those resulting from the conductors, ground wires and mitigation loop with optimal capacitance and the optimal parameters obtained from Tables 6.3, 6.4, 6.5 and 6.6. The magnetic field intensity decreases further, with a maximum reduction of 8.0552% (at point P1) and 19.5326% (at point P2).
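The capacitance values quoted for Fig. 6.6 can be related to the reactance axis of that figure through $Z_c = jX_s = 1/(j\omega C_s)$, which gives $X_s = -1/(\omega C_s)$. The short sketch below performs this conversion at 50 Hz; the function names are hypothetical and the code is only a worked illustration of the formula.

```python
import math

FREQUENCY_HZ = 50.0
OMEGA = 2.0 * math.pi * FREQUENCY_HZ


def reactance_from_capacitance(c_farad):
    """X_s such that jX_s = 1/(j*omega*C_s); negative for a capacitor."""
    return -1.0 / (OMEGA * c_farad)


def capacitance_from_reactance(x_ohm):
    """Inverse conversion; only capacitive (negative) reactances are accepted."""
    if x_ohm >= 0.0:
        raise ValueError("a capacitive reactance must be negative")
    return -1.0 / (OMEGA * x_ohm)


x_best = reactance_from_capacitance(4.897e-3)    # optimal case quoted in the text
x_worst = reactance_from_capacitance(2.358e-3)   # worst case quoted in the text
print(round(x_best, 3), round(x_worst, 3))
```

Both reactances fall inside the scanned interval from -2 Ω to 0, close to -0.65 Ω and -1.35 Ω respectively.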

Table 6.5 The effect of the mitigation loop spacings on the calculated magnetic field intensity at point (P1) and 21 m height

Magnetic field (A/m) at P1 at the given distance from the center phase

Distance of loop from center phase   Loop termination           -15 m    -10 m    0 m      10 m     15 m
5 m                                  Short circuit              21.54    25.03    24.88    25.56    22.20
                                     With optimal capacitance   20.77    24.07    23.28    24.83    21.75
7.5 m                                Short circuit              20.43    23.48    23.24    24.15    21.22
                                     With optimal capacitance   18.65    21.21    20.67    22.73    20.26
10 m                                 Short circuit              18.35    20.90    21.42    21.98    19.45
                                     With optimal capacitance   14.77    16.58    18.25    20.19    18.01
13.2 m                               Short circuit              14.19    17.06    19.26    19.01    16.13
                                     With optimal capacitance   7.01     9.56     15.87    18.15    15.40
15 m                                 Short circuit              14.57    18.22    20.51    20.19    16.69
                                     With optimal capacitance   7.66     11.28    17.14    19.17    16.12

Table 6.6 The effect of the mitigation loop spacings on the calculated magnetic field intensity at point (P2) and 21 m height

Magnetic field (A/m) at P2 at the given distance from the center phase

Distance of loop from center phase   Loop termination           -15 m    -10 m    0 m      10 m     15 m
5 m                                  Short circuit              10.69    11.89    12.79    12.17    11.06
                                     With optimal capacitance   10.21    11.32    12.16    11.73    10.72
7.5 m                                Short circuit              10.06    11.14    11.96    11.46    10.47
                                     With optimal capacitance   9.01     9.94     10.73    10.57    9.77
10 m                                 Short circuit              9.01     9.97     10.77    10.38    9.52
                                     With optimal capacitance   7.13     7.93     8.91     9.05     8.44
13.2 m                               Short circuit              7.09     7.99     8.93     8.61     7.84
                                     With optimal capacitance   3.98     4.88     6.67     7.25     6.76
15 m                                 Short circuit              7.28     8.33     9.41     8.98     8.08
                                     With optimal capacitance   4.34     5.38     7.24     7.69     7.10

Fig. 6.7 Comparison between the calculated magnetic field intensity values resulting from the conductors, ground wires and short circuit mitigation loop; and from the conductors, ground wires and mitigation loop with capacitance of optimal value at optimal height and spacing (magnetic field intensity in A/m versus distance from the center phase in m; curves: P1: Cond+Ground+Mit.SC; P2: Cond+Ground+Mit.SC; P1: Cond+Ground+Mit. Opt. C; P2: Cond+Ground+Mit. Opt. C)

6.5 Conclusion

The effects of the currents in the subconductors of each phase bundle, the currents in the ground wires, the currents in the mitigation loop, and the earth return currents on the calculated magnetic field are investigated by using the MTL technique. Furthermore, the effect of the conductors' sag between towers and the effect of the variation of the sag with temperature on the calculated magnetic field are studied. Finally, the passive loop conductor design parameters for the Egyptian 500 kV overhead transmission line are obtained at ambient temperature (35°C).

References

1. International Association of Engineers [Online]. Available: http://www.iaeng.org
2. El Dein AZ (2010) Mitigation of magnetic field under Egyptian 500 kV overhead transmission line. Lecture notes in engineering and computer science: Proceeding of the World Congress on Engineering, vol II WCE 2010, 30 June–2 July 2010, London, UK, pp 956–961
3. Hossam-Eldin AA (2001) Effect of electromagnetics fields from power lines on living organisms. In: IEEE 7th International Conference on Solid Dielectrics, June 25–29, Eindhoven, The Netherlands, pp 438–441
4. Karawia H, Youssef K, Hossam-Eldin AA (2008) Measurements and evaluation of adverse health effects of electromagnetic fields from low voltage equipments. MEPCON Aswan, Egypt, March 12–15, pp 436–440
5. Dahab AA, Amoura FK, Abu-Elhaija WS (2005) Comparison of magnetic-field distribution of noncompact and compact parallel transmission-line configurations. IEEE Trans Power Deliv 20(3):2114–2118
6. Stewart JR, Dale SJ, Klein KW (1993) Magnetic field reduction using high phase order lines. IEEE Trans Power Deliv 8(2):628–636
7. Yamazaki K, Kawamoto T, Fujinami H (2000) Requirements for power line magnetic field mitigation using a passive loop conductor. IEEE Trans Power Deliv 15(2):646–651
8. Olsen RG, Wong P (1992) Characteristics of low frequency electric and magnetic fields in the vicinity of electric power lines. IEEE Trans Power Deliv 7(4):2046–2053
9. Begamudre RD (2006) Extra high voltage AC transmission engineering, 3rd edn, Chapter 7. Wiley Eastern Limited, pp 172–205
10. Brandão Faria JA, Almeida ME (2007) Accurate calculation of magnetic-field intensity due to overhead power lines with or without mitigation loops with or without capacitor compensation. IEEE Trans Power Deliv 22(2):951–959
11. de Villiers W, Cloete JH, Wedepohl LM, Burger A (2008) Real-time sag monitoring system for high-voltage overhead transmission lines based on power-line carrier signal behavior. IEEE Trans Power Deliv 23(1):389–395
12. Noda T (2005) A double logarithmic approximation of Carson's ground-return impedance. IEEE Trans Power Deliv 21(1):472–479
13. Ramirez A, Uribe F (2007) A broad range algorithm for the evaluation of Carson's integral. IEEE Trans Power Deliv 22(2):1188–1193
14. Benato R, Caldon R (2007) Distribution line carrier: analysis procedure and applications to DG. IEEE Trans Power Deliv 22(1):575–583
15. Overhead Cable Sag Calculation Program. http://infocom.cqu.edu.au/Staff/Michael_O_malley/web/overhead_cable_sag_calculator.html
16. El Dein AZ (2009) Magnetic field calculation under EHV transmission lines for more realistic cases. IEEE Trans Power Deliv 24(4):2214–2222

Chapter 7

Universal Approach of the Modified Nodal Analysis for Nonlinear Lumped Circuits in Transient Behavior

Lucian Mandache, Dumitru Topan and Ioana-Gabriela Sirbu

Abstract Recent approaches for time-domain analysis of lumped circuits deal with differential-algebraic-equation (DAE) systems instead of SPICE-type resistive models. Although simple and powerful, DAE models based on modified nodal approaches require some restrictions related to redundant variables or circuit topology. In this context, the paper proposes an improved version that allows treating nonlinear analog circuits of any topology, including floating capacitors, magnetically coupled inductors, excess elements and controlled sources. The procedure has been implemented in a dedicated program that builds the symbolic DAE model and solves it numerically.

7.1 Introduction

The transient analysis of analog nonlinear circuits requires a numerical integration that is commonly performed through associated discrete circuit models (SPICE-type models). In this manner, resistive circuits are solved sequentially at each time

L. Mandache, I.-G. Sirbu
Faculty of Electrical Engineering, University of Craiova, 107 Decebal Blv, 200440, Craiova, Romania

D. Topan
Faculty of Electrical Engineering, University of Craiova, 13 A.I. Cuza Str, 200585, Craiova, Romania

S. I. Ao and L. Gelman (eds.), Electrical Engineering and Applied Computing, Lecture Notes in Electrical Engineering, 90, DOI: 10.1007/978-94-007-1192-1_7, Springer Science+Business Media B.V. 2011


L. Mandache et al.

step [1–3]. A different strategy involves building state or semistate mathematical models, as differential or differential–algebraic equation systems [4–6]. These are solved by specific numerical methods without engaging equivalent circuit models. Therefore, the problem of circuit analysis is transferred to a purely mathematical one. The latter strategy has been extended during the last decades, taking advantage of the progress of information technology [7–12]. The paper is focused on a semistate equations-based method associated with the modified nodal approach. This method avoids singular matrices in the equation system and overcomes the restriction related to floating capacitors [1, 14]. It also benefits from our previously developed topological analysis based on a single connection graph [3, 13] instead of two or more graphs [2, 10], even when the circuit contains controlled sources. A simple, robust and comprehensive method is obtained. The semistate mathematical model corresponding to the modified nodal approach (MNA) in the time domain has the general form of an ordinary differential equation system:

$$\begin{cases} M(x,t)\,\dot{x}(t) + N(x,t)\,x(t) = f(x,t), \\ x(t_0) = x_0. \end{cases} \tag{7.1}$$

The vector of circuit variables $x(t)$, with the initial value $x_0$, contains the vector of the node voltages $v_{n-1}$ and the vector of the branch currents $i_m$ that cannot be expressed in terms of node voltages and/or their first-order derivatives:

$$x(t) = \begin{bmatrix} v_{n-1}(t) \\ i_m(t) \end{bmatrix}. \tag{7.2}$$

Therefore, the vector $i_m$ contains the currents of zero-impedance elements (so-called MNA-incompatible elements): independent and controlled voltage sources, the controlling currents of current-controlled sources, the inductor currents and the currents of the current-controlled nonlinear resistors [1, 3, 5, 13]. $M(x,t)$ and $N(x,t)$ are square and, in general, state- and time-dependent matrices containing the parameters of the nonlinear elements.
The matrix M contains the inductances and capacitances of the energy storage circuit elements (dynamic parameters for the nonlinear storage elements), while the matrix N contains the resistances and conductances of the resistors (dynamic parameters for the nonlinear resistors). Since the matrix M is commonly singular, the mathematical model (7.1) requires a special treatment. $f(x,t)$ contains the circuit excitations and the parameters associated with the incremental sources used in the local linearization of the nonlinear resistors: current sources for voltage-controlled nonlinear resistors and voltage sources for current-controlled nonlinear resistors. Although building the mathematical model (7.1) is relatively simple, the existence of a unique solution is debatable in most cases (the problem of the possible singularity of the matrix M has already been reported).

7 Universal Approach of the Modified Nodal Analysis


The paper is organized as follows: Sect. 7.2 explains the problem of floating capacitors in relation to the existence and uniqueness of the solution of Eq. 7.1; the improved version of the MNA, which yields an equivalent well-posed equation, is described in Sect. 7.3; and an example is treated in Sect. 7.4.

7.2 The Problem of Floating Capacitors Related to MNA

The discussion of floating capacitors requires building the circuit connection graph. We adopt the single-graph procedure, with its specific preliminary actions related to the appropriate modeling of the controlling ports of controlled sources [3]. Once the connection graph has been built and the ground node has been chosen, the capacitor subgraph is simply extracted. If the capacitor subgraph of the circuit is not connected, then redundant variables appear in Eq. 7.1 and the circuit response cannot be computed, as shown below. The DAE mathematical model based on the MNA requires that any linear or nonlinear capacitor be linked to the ground node through a path of capacitors. A capacitor that does not fulfil this requirement is called a floating capacitor (see Fig. 7.1). The time-domain nodal equations for such a structure are:

$$\begin{cases} C_k \dot{v}_p - C_k \dot{v}_q + \sum\limits_{j \in (p)} i_j = 0, \\ -C_k \dot{v}_p + C_k \dot{v}_q + \sum\limits_{s \in (q)} i_s = 0, \end{cases} \tag{7.3}$$

where the state matrix

$$M_k = \begin{bmatrix} C_k & -C_k \\ -C_k & C_k \end{bmatrix} \tag{7.4}$$

is obviously singular. Therefore, such a mathematical model is inappropriate. If one of the equations (7.3) is replaced by the cutset current law expressed for the cutset Σ surrounding the floating capacitor:

$$\begin{cases} C_k \dot{v}_p - C_k \dot{v}_q + \sum\limits_{j \in (p)} i_j = 0, \\ \sum\limits_{j \in (p),\, j \neq k} i_j + \sum\limits_{s \in (q),\, s \neq k} i_s = 0, \end{cases} \tag{7.5}$$

then the singular matrix is avoided. The second equation in (7.5) can be obtained simply by adding both nodal equations (7.3). Nevertheless, the equation system (7.5) contains a redundant variable because two dynamic variables are involved in only one differential equation. Moreover, these two state variables correspond to

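Fig. 7.1 Floating capacitor (capacitor $C_k$ with current $i_k$ connected between nodes (p) and (q), node voltages $v_p$ and $v_q$, external branch currents $i_j$ and $i_s$, and the surrounding cutset Σ)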

only one capacitor. If the capacitor is grounded, with $v_q = 0$, then the nodal equation associated with the node (p) becomes

$$C_k \dot{v}_p + \sum_{j \in (p)} i_j = 0 \tag{7.6}$$

and the problem of a singular matrix or a redundant variable does not appear. Extrapolating the above reasoning, if any subgraph of capacitors is floating, then the number of variables exceeds the number of essential capacitors, one variable being redundant. In order to overcome the problem of the redundant variable, a change of variables will be performed: the node voltages will be replaced by the tree-branch voltages.
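The argument of this section can be checked on the 2 × 2 block of a single floating capacitor. The sketch below is an illustration under assumed values (the capacitance is hypothetical, and the helper names are not from the paper): it shows that the matrix (7.4) has zero determinant, and that adding the two nodal rows (the cutset law) and then replacing $v_p$ by the branch voltage $u_k = v_p - v_q$ leaves a single nonzero dynamic entry.

```python
def det2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def matmul(a, b):
    """Plain matrix product for small nested-list matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


C_K = 1.0e-6  # hypothetical capacitance (F)

# State matrix (7.4) of a floating capacitor between nodes (p) and (q)
m_k = [[C_K, -C_K], [-C_K, C_K]]

# Row operation: keep row 1, replace row 2 by row 1 + row 2 (cutset law)
ROW_SUM = [[1.0, 0.0], [1.0, 1.0]]

# Change of variables: v_p = u_k + v_q, v_q = v_q
TO_BRANCH = [[1.0, 1.0], [0.0, 1.0]]

reduced = matmul(matmul(ROW_SUM, m_k), TO_BRANCH)
print(det2(m_k), reduced)
```

After both operations only the derivative of the capacitor voltage $u_k$ appears, which is why the tree-branch voltages remove the redundant variable.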

7.3 Improved Version of the MNA

To overcome the problem of singular matrices and redundant variables introduced by the floating capacitors, our method requires accomplishing three main steps:

Step 1  Build the modified nodal equations by ignoring the floating capacitor problem;

Step 2  Identify all subgraphs of floating capacitors, as well as the nodal equations related to their nodes [1, 3]; for each such subgraph, replace one of the nodal equations by the cutset current law expressed for the cutset surrounding the subgraph, as in the second equation of (7.5); an equivalent mathematical model is obtained, with a general form similar to (7.1):

$$M' \begin{bmatrix} \dot{v}_{n-1}(t) \\ \dot{i}_m(t) \end{bmatrix} + N' \begin{bmatrix} v_{n-1}(t) \\ i_m(t) \end{bmatrix} = f'. \tag{7.7}$$

Step 3  Perform a change of variables: the vector of the node voltages $v_{n-1}$ is replaced by the vector of the tree-branch voltages $u_t$, the vector $i_m$ remaining unchanged.


As is well known, the MNA does not require finding a normal tree of the given circuit. Nevertheless, a normal tree is required in order to perform the change of variables. We previously developed a simple and efficient method to build normal trees systematically [3, 13], which requires only a few preliminary adjustments in the circuit diagram: the controlling branches of voltage-controlled sources must be modeled by zero-valued independent current sources, and the controlling branches of current-controlled sources must be modeled by zero-valued independent voltage sources. The magnetically coupled inductors need to be modeled through equivalent diagrams with controlled sources. Thus, the normal tree is necessary for identifying the excess capacitors and inductors. Once a normal tree has been found, step 3 of our algorithm can be performed. The node-branch incidence matrix is partitioned according to the tree/cotree branches:

$$A = [\,A_t \mid A_c\,], \tag{7.8}$$

where $A_t$ corresponds to the tree branches and $A_c$ corresponds to the cotree branches [3, 13]. Next, the tree-branch voltages may be expressed in terms of the node voltages [2, 13] using the transpose of the square nonsingular matrix $A_t$:

$$u_t = A_t^{T} v_{n-1}. \tag{7.9}$$

Since the existence of the normal tree guarantees that the matrix $A_t$ is always square and nonsingular, and consequently invertible, the node voltages in (7.9) can be expressed in terms of the tree-branch voltages:

$$v_{n-1} = A' u_t, \tag{7.10}$$

where $A'$ denotes the inverse of $A_t^{T}$. The inverse matrix $A'$ can be obtained relatively simply, owing to the sparsity of $A_t$, whose nonzero elements are equal to +1 or -1. Using (7.10) to substitute the vector $v_{n-1}$ in (7.7), the mathematical model becomes

$$M' \begin{bmatrix} A' & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} \dot{u}_t \\ \dot{i}_m \end{bmatrix} + N' \begin{bmatrix} A' & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} u_t \\ i_m \end{bmatrix} = f' \tag{7.11}$$

or

$$M''(x',t)\,\dot{x}'(t) + N''(x',t)\,x'(t) = f'(x',t) \tag{7.12}$$

where obvious notations were used. The new vector of variables is $x'(t)$. We extract from $x'$ the essential capacitor voltages and the essential inductor currents as the elements of the state vector $x_s$ of length s (the subscript s comes from "state"):

$$x_s = \begin{bmatrix} u_C \\ i_L \end{bmatrix}. \tag{7.13}$$
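The change of variables in (7.9) and (7.10) can be illustrated on a small hypothetical tree. In the sketch below, a three-node circuit (plus ground) has the tree branches 1→2, 2→3 and 3→ground; the incidence convention (+1 where a branch leaves a node, -1 where it enters) and the helper names are assumptions of this example. Exact rational arithmetic from the standard library is used so that the ±1 structure of the inverse is visible.

```python
from fractions import Fraction


def transpose(m):
    return [list(row) for row in zip(*m)]


def invert(m):
    """Gauss-Jordan inversion with exact fractions (raises if m is singular)."""
    n = len(m)
    aug = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]


# Rows: nodes 1..3; columns: tree branches 1->2, 2->3, 3->ground
A_T = [[1, 0, 0],
       [-1, 1, 0],
       [0, -1, 1]]

A_PRIME = invert(transpose(A_T))          # A' = (A_t^T)^(-1), as in (7.10)
node_voltages = matvec(A_PRIME, [2, 3, 5])  # tree-branch voltages u_t -> v_{n-1}
print(A_PRIME, node_voltages)
```

Each node voltage is simply the sum of the tree-branch voltages along the tree path from that node to ground, which is why $A'$ again contains only 0 and ±1 entries.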


The remained elements of x0 are grouped in the vector xa . The vector of variables organized as above involves splitting the equation system (7.12) as follows:

ð7:14Þ

We remark that only the partition $M_{ss}$, of size $s \times s$, of the matrix $M''$ is nonsingular, all its other elements being zero. A differential–algebraic equation (DAE) system is thus obtained:

$$\begin{cases} M_{ss}\, \dot{x}_s(t) + N_{ss}\, x_s(t) + N_{sa}\, x_a(t) = f_s(x_s, x_a, t), \\ N_{as}\, x_s(t) + N_{aa}\, x_a(t) = f_a(x_s, x_a, t), \end{cases} \tag{7.15}$$

with the initial condition

$$x_s(t_0) = \begin{bmatrix} u_C(t_0) \\ i_L(t_0) \end{bmatrix}. \tag{7.16}$$

Therefore, the vector $x_s$ contains the variables of the differential equation system, while $x_a$ contains the variables of the algebraic equation system (the subscript $a$ stands for "algebraic"). Many numerical techniques suitable for DAEs can be used to find the time-domain solution. In principle, the computation procedure requires discretizing the analysis time and running the following steps:

• Solve the algebraic equation from (7.15), assigning the initial values to the state variables:

$$N_{as}\, x_s(t_0) + N_{aa}\, x_a = f_a \tag{7.17}$$

in order to find the solution $x_a(t_0)$.

• Perform a numerical integration of the differential equation from (7.15) over the first discrete time interval $(t_0, t_1)$, assigning the value $x_a(t_0)$ to the vector $x_a$ and taking (7.16) as the initial condition:

$$\begin{cases} M_{ss}\, \dot{x}_s + N_{ss}\, x_s + N_{sa}\, x_a(t_0) = f_s, \\ x_s(t_0) = x_{s0}. \end{cases} \tag{7.18}$$

The solution $x_s(t_1)$ is obtained.

• At time step $k$, solve the algebraic equation, assigning to the state variables the values $x_s(t_k)$ calculated previously, during the numerical integration over the time interval $(t_{k-1}, t_k)$:

$$N_{as}\, x_s(t_k) + N_{aa}\, x_a = f_a. \tag{7.19}$$

The solution $x_a(t_k)$ is found.

• Perform a numerical integration of the differential equation over the next discrete time interval $(t_k, t_{k+1})$, assigning the previously computed value $x_a(t_k)$ to the vector $x_a$ and taking the values $x_s(t_k)$ as the initial condition:

$$\begin{cases} M_{ss}\, \dot{x}_s + N_{ss}\, x_s + N_{sa}\, x_a(t_k) = f_s, \\ x_s(t_k) = x_{sk}. \end{cases} \tag{7.20}$$

The solution $x_s(t_{k+1})$ is obtained.

The last two steps are repeated until the final moment of the analysis time is reached. The efficiency of the iterative algorithms used for solving the nonlinear algebraic equations is significantly enhanced if $x_a(t_{k-1})$ is taken as the starting point.

The method described above has been implemented in a computation program under the high-performance computing environment MATLAB. It recognizes the input data stored in a SPICE-compatible netlist, performs a topological analysis in order to build a normal tree and the incidence matrices, identifies the excess elements and floating capacitors, builds the symbolic mathematical model as in expression (7.15), solves it numerically and represents the solution graphically.
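The alternation between the algebraic solve and the integration step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it treats a scalar, linear case (a hypothetical series RC circuit with the capacitor voltage as state variable and the loop current as algebraic variable), it uses a fixed-step backward Euler formula instead of the variable-step Gear algorithm mentioned in the example section, and all function and parameter names are invented here.

```python
import math


def solve_algebraic(n_as, n_aa, f_a, x_s):
    """Solve N_as*x_s + N_aa*x_a = f_a for x_a (scalar, linear case)."""
    return (f_a - n_as * x_s) / n_aa


def integrate_step(m_ss, n_ss, n_sa, f_s, x_s, x_a, h):
    """One backward Euler step of M_ss*dx_s/dt + N_ss*x_s + N_sa*x_a = f_s,
    with x_a frozen at its value from the preceding algebraic solve."""
    return (m_ss / h * x_s + f_s - n_sa * x_a) / (m_ss / h + n_ss)


def simulate(p, x_s0, t_end, h):
    """Alternate the algebraic solve and the integration step until t_end."""
    steps = round(t_end / h)
    x_s = x_s0
    history = []
    for k in range(steps + 1):
        x_a = solve_algebraic(p["n_as"], p["n_aa"], p["f_a"], x_s)
        history.append((k * h, x_s, x_a))
        if k < steps:
            x_s = integrate_step(p["m_ss"], p["n_ss"], p["n_sa"],
                                 p["f_s"], x_s, x_a, h)
    return history


# Hypothetical series RC circuit fed by a DC source E:
#   C*du/dt - i = 0   (differential part, state u)
#   u + R*i = E       (algebraic part, variable i)
R, C, E = 1e3, 1e-6, 1.0
RC_PARAMS = dict(m_ss=C, n_ss=0.0, n_sa=-1.0, f_s=0.0,
                 n_as=1.0, n_aa=R, f_a=E)

HISTORY = simulate(RC_PARAMS, x_s0=0.0, t_end=5e-3, h=1e-6)
T_END, U_END, I_END = HISTORY[-1]
U_EXACT = E * (1.0 - math.exp(-T_END / (R * C)))  # analytical step response
```

Because the algebraic variable is held constant over each interval, the accuracy of such a scheme depends directly on the step size, which is one reason a variable-step integrator is attractive for stiff circuits.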

7.4 Example

Let us study the transient behavior of an electromechanical system consisting of a brushed permanent magnet DC motor supplied by an uncontrolled half-wave rectifier, the mechanical load being nonlinear. The equivalent diagram built according to the transient model is shown in Fig. 7.2. It is not our goal here to explain the correspondence between the electromechanical system and the equivalent circuit diagram, or to judge the results from the point of view of their technical use; only the algorithm described above is emphasized. The diagram contains one floating capacitor (branch 18) and two nonlinear resistors, whose characteristics are shown in Fig. 7.3: the current-controlled nonlinear resistor of branch 11 models the nonlinear mechanical load, reproducing the speed-torque curve, and the voltage-controlled nonlinear resistor of branch 16 corresponds to the semiconductor diode. The independent zero-current sources 9, 10 and 14 correspond to the controlling branches of the voltage-controlled sources 5, 13 and 7, respectively, while the independent zero-voltage source 4 corresponds to the controlling branch of the current-controlled current source of branch 6. The circuit is supplied by the independent sinusoidal voltage source of branch 1.

Fig. 7.2 Circuit example (the diagram labels the permanent magnet DC motor)

Fig. 7.3 Nonlinear resistors characteristics (panels: diode, branch 16; nonlinear resistor, branch 11; axes in [A] and [V])

If node 10 is grounded, the topological analysis performed by our computing program gives the result:

The circuit does not contain excess inductors
Normal tree branches: 1 4 5 8 15 18 16 2 12
MNA-incompatible branches: 1 3 4 5 11
Floating capacitor subgraph 1: reference_node: 8 other_nodes: 7

Therefore, the semistate variables are $x_s = [u_8, u_{15}, u_{18}, i_3]^{\mathrm{T}}$, and the variables of the algebraic equation system are $x_a = [u_1, u_4, u_5, u_{16}, u_2, u_{12}, i_1, i_4, i_5, i_{11}]^{\mathrm{T}}$. The computing program gives the mathematical model in the symbolic form of type (7.15):


• The differential equation system:

C15 Du15 = G17 u18 - (G17 - Gd16) u16 - G2 u2 + J0R16
C8 Du8 = G7_10 u12 - B6_4 i4 + J9 + J14
C18 Du18 = G19 u15 - G19 u1 - (Gd16 + G19) u16 - J0R16
L3 Di3 = u15 + u4 - u5 + u2

• The algebraic equation system:

i3 - G2 u2 = 0
i3 + i4 = 0
i4 - i5 = 0
G13_14 u8 + G12 u12 + i11 + J10 = 0
G19 u15 - G17 u18 + G19 u1 + (G17 + Gd16 + G19) u16 + J0R16 = 0
G19 u15 - G19 u1 - G19 u16 - i1 = 0
u1 + E_sin_1 = 0
u4 + E4 = 0
A5_9 u8 + u5 = 0
u12 - Rd11 i11 - E0R11 = 0

Since the mathematical model above is generated automatically by the computing program, some non-obvious notations are used (e.g. Du15 is the first derivative of the state variable u15; Gd16 is the conductance of the nonlinear resistor of branch 16; J0R16 is the incremental current source used in the local linearization of the voltage-controlled nonlinear resistor of branch 16; A5_9 is the voltage gain of the voltage-controlled voltage source of branch 5 controlled by branch 9). Assuming zero initial conditions, the solving algorithm gives the result as time-domain functions, some examples being given in Fig. 7.4. Although the analysis time was 800 ms in order to cover the slowest component of the transient response, only details for the first 100 ms are shown in Fig. 7.4. The DAE system was solved using Gear's numerical integration algorithm with variable time step, combined with a Newton–Raphson algorithm. With the computation errors restricted to the limit values of $10^{-7}$ (absolute value) and $10^{-4}$ (relative value), the time step of the numerical integration process was maintained between 47 ns and 558 μs. We remark that the same results were obtained through a reference SPICE simulation, using the version ICAP/4 from Intusoft [15].

Fig. 7.4 Example of analysis results (panels: u8 [V], u15 [V], u18 [V] and i3 [A] versus time [s], for the first 0.1 s)

7.5 Conclusion

An efficient and fully feasible algorithm intended for the time-domain analysis of nonlinear lumped analog circuits was developed and implemented in a computation program. It overcomes some restrictions of the modified nodal approaches, having a practically unlimited degree of generality for RLCM circuits. The algorithm benefits from the simplicity of the MNA, and the numerical methods for solving the mathematical model are flexible and can be optimized without requiring any companion diagrams (unlike the SPICE-like algorithms). In this manner, the computation time and the computer requirements can be reduced as compared to other methods. Our contribution is illustrated by an example, the chosen circuit containing nonlinear resistors, floating capacitors and controlled sources.

Acknowledgments This work was supported in part by the Romanian Ministry of Education, Research and Innovation under Grant PCE 539/2008.


References

1. Mandache L, Topan D (2010) Improved modified nodal analysis of nonlinear analog circuits in the time domain. Lecture notes in engineering and computer science, vol 2184: Proceedings of the World Congress on Engineering, London, UK, June 30–July 2, pp 905–908
2. Chua LO, Lin PM (1975) Computer-aided analysis of electronic circuits: algorithms and computational techniques. Prentice-Hall, Englewood Cliffs
3. Iordache M, Mandache L (2004) Computer-aided analysis of nonlinear analog circuits (original title in Romanian). Ed. Politehnica Press, Bucharest (in Romanian)
4. Hodge A, Newcomb R (2002) Semistate theory and analog VLSI design. IEEE Circuits Syst Mag Second Quart 2(2):30–51
5. Newcomb R (1981) The semistate description of nonlinear time-variable circuits. IEEE Trans Circuits Syst CAS-28(1):62–71
6. Ho CW, Ruehli AE, Brennan PA (1975) The modified nodal approach to network analysis. IEEE Trans Circuits Syst CAS-22:504–509
7. Yamamura K, Sekiguchi T, Inoue Y (1999) A fixed-point homotopy method for solving modified nodal equations. IEEE Trans Circuits Syst I: Fundam Theory Appl 46(6):654–665
8. Brambilla A, Premoli A, Storti-Gajani G (2005) Recasting modified nodal analysis to improve reliability in numerical circuit simulation. IEEE Trans Circuits Syst I: Regul Pap 52(3):522–534
9. Lee K, Park SB (1985) Reduced modified nodal approach to circuit analysis. IEEE Trans Circuits Syst 32(10):1056–1060
10. Chang FY (1997) The unified nodal approach to circuit analysis. In: IEEE International Symposium on Circuits and Systems, June 9–12, 1997, Hong Kong, pp 849–852
11. Hu JD, Yao H (1988) Generalized modified nodal formulation for circuits with nonlinear resistive multiports described by sample data. In: IEEE International Symposium on Circuits and Systems, vol 3, 7–9 June 1988, pp 2205–2208
12. Kang Y, Lacy JG (1992) Conversion of MNA equations to state variable form for nonlinear dynamical circuits. Electron Lett 28(13):1240–1241
13. Topan D, Mandache L (2007) Special matters of circuit analysis (original title in Romanian). Universitaria, Craiova (in Romanian)
14. Mandache L, Topan D (2003) An extension of the modified nodal analysis method. In: European Conference on Circuit Theory and Design ECCTD '03, September 1–4 2003, Kraków, pp II-410–II-413
15. ICAP/4: IsSpice4 User's guide (1998) Intusoft, San Pedro, California, USA

Chapter 8

Modified 1.28 Tbit/s (32 × 4 × 10 Gbit/s) Absolute Polar Duty Cycle Division Multiplexing-WDM Transmission Over 320 km Standard Single Mode Fiber

Amin Malekmohammadi

Abstract A new version of the Absolute Polar Duty Cycle Division Multiplexing (AP-DCDM) transmission scheme over a Wavelength Division Multiplexing (WDM) system is proposed. We modeled and analyzed a method to improve the performance of AP-DCDM over WDM by using a Dual-Drive Mach–Zehnder Modulator (DD-MZM). An improvement of almost 4.1 dB in the receiver sensitivity of 1.28 Tbit/s (32 × 40 Gbit/s) AP-DCDM-WDM over 320 km of fiber is achieved by optimizing the bias voltage of the DD-MZM.

A. Malekmohammadi
Department of Electrical and Electronic Engineering, The University of Nottingham, Malaysia Campus, Kuala Lumpur, Malaysia
e-mail: [email protected]

S. I. Ao and L. Gelman (eds.), Electrical Engineering and Applied Computing, Lecture Notes in Electrical Engineering, 90, DOI: 10.1007/978-94-007-1192-1_8, © Springer Science+Business Media B.V. 2011

8.1 Introduction

Wavelength division multiplexing (WDM) technologies have enabled ultra-high-capacity transmission of over 1 Tbit/s using Erbium-Doped Fiber Amplifiers (EDFAs). To pack a Tbit/s capacity into the gain bandwidth, spectral efficiency has to be improved. Narrow filtering characteristics and a highly stable center frequency of the optical filters are required to achieve dense WDM systems. Although such narrow optical filters could be developed [1, 2], narrow filtering of the signal light would result in waveform distortion in the received signal. Thus, compact-spectrum signals are also required for reducing the distortion due to narrow filtering. Absolute Polar Duty Cycle Division Multiplexing (AP-DCDM) is an alternative multiplexing technique which is able to support many users per WDM channel [3, 4]. Therefore, as reported in [4], the capacity of the WDM channels can be increased tremendously by using this technique. AP-DCDM enables the use of narrow optical filters, which frees up space to increase the channel count. The AP-DCDM system has an intrinsic sensitivity penalty as compared to the binary signal, due to the fragmentation of the main eye into smaller eyes [3]. At the same received power, these small eyes have different qualities and therefore cause different AP-DCDM channels to have different performances, which is not desirable in telecommunication systems [3, 4].

In this paper, a Dual-Drive Mach–Zehnder Modulator (DD-MZM) is used in a 1.28 Tbit/s AP-DCDM-WDM setup in order to improve the performance of the AP-DCDM-WDM transmission system. It is shown that, by optimum adjustment of the bias voltage at both ports, the sensitivity of the worst AP-DCDM channel in 1.28 Tbit/s AP-DCDM-WDM over 320 km SSMF can be improved by 4.1 dB.

Mach–Zehnder modulators have the important feature that the chirp of the transmitted signal is a function of the electro-optic properties of the p-i-n waveguide, the splitting ratios of the two branch waveguides, the differential length between the two arms of the interferometer, and the format of the modulating voltages applied to the arm electrodes [5–7]. An important property of the DD-MZM is that, due to the quantum-confined Stark effect, the attenuation and phase constants of an optical signal propagating in the p-i-n waveguide are nonlinear functions of the applied voltage. Since these constants determine the modulator extinction ratio and chirp, the bias and modulation voltages can be optimized to yield the minimum degradation in receiver sensitivity due to fiber dispersion and self-phase modulation [6, 8].

8.2 Conventional 32-Channel AP-DCDM-WDM Transmission

As shown in Fig. 8.1, the evaluation starts with four AP-DCDM channels (4 × 10 Gbit/s) with a PRBS of $2^{10}-1$ (Fig. 8.1a), followed by 32 WDM channels (32 × 4 × 10 Gbit/s) (Fig. 8.1b). In Fig. 8.1, four OOK channels are multiplexed by using AP-DCDM, and the outputs are multiplexed by using the WDM technique (each WDM channel contains 4 × 10 Gbit/s with a PRBS of $2^{10}-1$, as shown in Fig. 8.1a). A channel spacing of 62.5 GHz (0.5 nm) was used. As a result, 128 AP-DCDM channels (32 × 4) are multiplexed in 32 WDM channels (λ1 to λ32) within the ~15.5 nm (1550–1565.5 nm) EDFA band. A WDM spectral efficiency of 0.64 bit/s/Hz was achieved without polarization multiplexing [7]. The transmission line consisted of 4 spans of 80 km Standard Single Mode Fiber (SSMF) followed by a 13.4 km Dispersion Compensation Fiber (DCF). The length ratio between SSMF and DCF is optimized so that the overall second-order dispersion reaches zero. For the SSMF, the simulated specifications for dispersion (D), dispersion slope (S), attenuation coefficient (α), effective area (A_eff) and nonlinear index of refraction (n_2) are 16.75 ps/nm/km, 0.07 ps/nm²/km, 0.2 dB/km, 80 μm² and $2.7 \times 10^{-20}$ m²/W, respectively. For the DCF, D of about -100 ps/nm/km, S of -0.3 ps/nm²/km, α of 0.5 dB/km, A_eff of 12 μm² and n_2 of $2.6 \times 10^{-20}$ m²/W are used. For the booster and the pre-amplifier, an erbium-doped fiber amplifier (EDFA) with a flat gain of 30 dB and a noise figure (NF) of 5 dB was used. The total power to the booster is 8.35 dBm and the launch power into the SSMF is +15 dBm (~0 dBm/channel). The Self Phase Modulation (SPM) effect in the link could be neglected, since the launched power into the SSMF and DCF was less than the SPM threshold.

Fig. 8.1 a 4 × 10 Gbit/s AP-DCDM transmission system. b Simulation setup of 1.28 Tbit/s (32 × 4 × 10 Gbit/s) AP-DCDM-WDM transmissions. c Optical spectrum before transmission. d Optical spectrum after transmission. e Single channel AP-DCDM spectrum
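The headline figures of this setup can be cross-checked with a little arithmetic. The Python fragment below is a hedged sketch using only the values quoted in this section; it assumes that one 13.4 km DCF follows each 80 km SSMF span, which is the reading under which the first-order dispersion cancels, and that the launch power is shared equally among the 32 WDM channels. The variable names are chosen here for illustration.

```python
import math

N_WDM = 32            # WDM channels
N_APDCDM = 4          # AP-DCDM users per WDM channel
BIT_RATE_GBPS = 10.0  # bit rate per AP-DCDM user
SPACING_GHZ = 62.5    # WDM channel spacing

SSMF_SPAN_KM, D_SSMF = 80.0, 16.75   # span length, dispersion in ps/nm/km
DCF_KM, D_DCF = 13.4, -100.0         # assumed: one DCF per span
LAUNCH_DBM = 15.0                    # total launch power into the SSMF

# Aggregate capacity: 32 x 4 x 10 Gbit/s
aggregate_gbps = N_WDM * N_APDCDM * BIT_RATE_GBPS

# Spectral efficiency: bit rate per WDM channel over the channel spacing
spectral_eff = N_APDCDM * BIT_RATE_GBPS / SPACING_GHZ

# Residual first-order dispersion per span, in ps/nm
residual_ps_nm = SSMF_SPAN_KM * D_SSMF + DCF_KM * D_DCF

# Per-channel launch power if the total is shared equally
per_channel_dbm = LAUNCH_DBM - 10.0 * math.log10(N_WDM)

print(f"capacity      = {aggregate_gbps:.0f} Gbit/s")
print(f"spectral eff. = {spectral_eff:.2f} bit/s/Hz")
print(f"residual D    = {residual_ps_nm:.2f} ps/nm per span")
print(f"per channel   = {per_channel_dbm:.2f} dBm")
```

The results agree with the quoted 1.28 Tbit/s, 0.64 bit/s/Hz and roughly 0 dBm per channel, and the 80 km to 13.4 km length ratio cancels the accumulated dispersion of each span.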


Figure 8.1c, d show the optical spectra of the 32 WDM channels before and after transmission, respectively. The effect of Four Wave Mixing (FWM) is negligible due to the phase mismatch in the highly dispersive transmission line [9, 10]. Figure 8.2a shows exemplary eye diagrams taken after the 320 km SSMF (4 spans of 80 km SSMF + 13.4 km DCF) for the worst channel (channel 16) of the WDM system, which contains 4 × 10 Gb/s AP-DCDM. As illustrated in Fig. 8.2 and reported in [7], the eye diagram generated for channel 16, which carries four AP-DCDM channels, contains 6 small eyes. Eyes 1, 2, 3 and 4 (slots 1 and 2) correspond to the performance of AP-DCDM channel 1; eyes 2, 4 and 5 (slots 2 and 3) are related to the performance of AP-DCDM channel 2; eyes 5 and 6 (slots 3 and 4) influence the performance of AP-DCDM channel 3; and eye 6 (slot 4) is related to AP-DCDM channel 4. As illustrated in Fig. 8.2a, at -25 dBm received power, the Q-factors of all four eyes located at the first level are more than 6, higher than those of the eyes located at the second level (around 3.6 and 3.8 for eyes 1 and 2, respectively). The eye openings at the different levels are almost similar but have different Q-factors, owing to the different standard deviations of the noise at each level. Therefore, at the same received power, the channel with the minimum noise variation has the best performance (e.g. channel 4) and the channel with the maximum variation has the worst performance (channel 1).
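To see why a Q-factor near 6 matters while values of 3.6 to 3.8 do not suffice, the Q-factors can be mapped to bit error rates. The sketch below uses the standard Gaussian-noise approximation BER = 0.5 erfc(Q/√2); this relation is not stated in the chapter and is applied here per eye purely as an illustrative assumption, since the BER of an AP-DCDM user depends on several eyes.

```python
import math


def q_to_ber(q):
    """Gaussian-noise approximation of the bit error rate for a given Q-factor."""
    return 0.5 * math.erfc(q / math.sqrt(2.0))


# Q-factors quoted for the conventional system at -25 dBm received power
for label, q in [("second-level eye 1", 3.6),
                 ("second-level eye 2", 3.8),
                 ("first-level eyes (lower bound)", 6.0)]:
    print(f"{label}: Q = {q:.1f} -> BER ~ {q_to_ber(q):.1e}")
```

Under this approximation a Q-factor of 6 corresponds to a BER of about 1e-9, whereas Q-factors below 4 give error rates several orders of magnitude higher, which is consistent with the second-level eyes limiting the worst channel.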

Fig. 8.2 a Received eye diagram for channel 16 in 32-channel AP-DCDM-WDM system. b Received eye diagram for channel 16 in 32-channel AP-DCDM-WDM system using optimized DD-MZM


8.3 Dual Drive-Mach–Zehnder Modulator

The Dual-Drive Mach–Zehnder modulator consists of an input Y-branch (splitter), two arms with independent drive electrodes, and an output Y-branch (combiner). The optical signal incident on the input Y-branch is split into the two arms of the interferometer. When the signals recombine at the output Y-branch, the on-state is achieved when there is no differential phase shift between the two signals, and the off-state is achieved when there is a differential phase shift of π radians. The total optical field at the output of the Y-branch combiner is, to a good approximation, the sum of the fields at the outputs of the two arms. If the splitting ratios of the input and output Y-branches are identical, the output of the modulator is given by [6]

$$E(V_1, V_2) = \frac{E_0}{1 + SR} \left\{ SR\, \exp\!\left[ -\left( \frac{\Delta\alpha_a(V_1)}{2} + j\, \Delta\beta(V_1) \right) L \right] + \exp\!\left[ -\left( \frac{\Delta\alpha_a(V_2)}{2} + j\, \Delta\beta(V_2) \right) L - j\, \Phi_0 \right] \right\} = \sqrt{I(V_1, V_2)}\, \exp\!\big( j\, \Phi(V_1, V_2) \big)$$

where SR = P1/P2 is the Y-branch power splitting ratio; Δα_a/2 is the attenuation constant; Δβ is the phase constant; Φ_0 is 0 radians for a conventional modulator and π radians for a π phase-shift modulator; V1 and V2 are the voltages applied to arms 1 and 2, respectively; I is the intensity of the optical signal; and Φ is its phase. For i = 1, 2,

$$V_i(t) = V_{bi} + V_{\mathrm{mod}\,i}\, v(t)$$

where V_bi is the bias voltage, V_mod i is the peak-to-peak modulation voltage, and v(t) is the modulation waveform, with a peak-to-peak amplitude of one and an average value of zero. The dependence of the attenuation and phase constants on the applied voltage can be obtained either by direct measurement of a straight section of waveguide cut from one arm of a modulator [5] or by using measurements of the voltage dependence of the intensity of the output signal for each arm with the other arm strongly absorbing [6–8].

Referring to Sect. 8.2, the improvement in system performance can be obtained by achieving an optimum amplitude distribution among the AP-DCDM signal levels. This can be achieved by optimizing the amplitude control of the levels. To satisfy that requirement, we implement the DD-MZM, which consists of an input Y-branch splitter, two arms with independent drive electrodes, and an output Y-branch combiner, in our setup as a replacement for the conventional single-drive amplitude modulator (AM).
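The two-arm field sum of this section can be explored numerically. The following Python sketch is an illustration only: it assumes that each arm contributes a factor exp[-(Δα_a/2 + jΔβ)L] and that the static phase shift Φ0 enters as exp(-jΦ0) in the second arm, it treats the attenuation and phase constants as plain numbers rather than measured functions of the drive voltage, and the function and parameter names are hypothetical.

```python
import cmath
import math


def mzm_field(e0, sr, dalpha1, dbeta1, dalpha2, dbeta2, length, phi0=0.0):
    """Output field of a two-arm Mach-Zehnder modulator (assumed sign convention).

    dalpha*/dbeta* are the attenuation and phase constants of arms 1 and 2
    at their respective drive voltages; sr is the power splitting ratio.
    """
    arm1 = sr * cmath.exp(-(dalpha1 / 2.0 + 1j * dbeta1) * length)
    arm2 = cmath.exp(-(dalpha2 / 2.0 + 1j * dbeta2) * length - 1j * phi0)
    return e0 / (1.0 + sr) * (arm1 + arm2)


def intensity_and_phase(field):
    """Return (I, phase) such that field = sqrt(I) * exp(j * phase)."""
    return abs(field) ** 2, cmath.phase(field)


# Lossless, balanced modulator (hypothetical values): on- and off-states
on_state = mzm_field(1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0)
off_state = mzm_field(1.0, 1.0, 0.0, 0.0, 0.0, math.pi, 1.0)

# With the splitting ratio of 1.3 used later in the chapter, a differential
# phase shift of pi no longer gives complete extinction.
off_unbalanced = mzm_field(1.0, 1.3, 0.0, 0.0, 0.0, math.pi, 1.0)

print(intensity_and_phase(on_state)[0])        # close to 1
print(intensity_and_phase(off_state)[0])       # close to 0
print(intensity_and_phase(off_unbalanced)[0])  # small but nonzero
```

The unbalanced case shows how the splitting ratio, together with the bias-dependent attenuation and phase of each arm, sets the relative amplitudes of the signal levels, which is the degree of freedom exploited in the optimization.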

Table 8.1 DD-MZM optimization process for (a) Vb2, (b) Vb1

Setup                    Q1    Q2    Q3    Q4    Q5    Q6
(a)
Conventional AP-DCDM     3.6   3.8   6.7   6.7   6.7   6.4
MZM, Vb2 = -1 V          4.4   4.7   6.3   6.5   6.4   6.1
MZM, Vb2 = -0.8 V        5.1   5.4   6.2   6.4   6.2   6.1
MZM, Vb2 = -0.6 V        5.9   6.1   6.1   6.3   6.2   6
MZM, Vb2 = -0.4 V        6.4   5.9   6.9   6.1   6     5.8
(b)
MZM, Vb1 = -3 V          6.5   5.5   7     6.5   6     5.5
MZM, Vb1 = -2.9 V        5.9   6.1   6.1   6.3   6.2   6
MZM, Vb1 = -2.8 V        5.5   5.8   6.5   6.3   6.2   6
MZM, Vb1 = -2.6 V        4.5   5     6.5   6.5   6.2   6

8.4 Optimizing the DD-MZM for 1.28 Tbit/s AP-DCDM-WDM Transmission

As discussed in Sect. 8.2, we need almost similar Q-factors for all 6 eyes in order to achieve similar performance for all channels. This can be done by improving the eye quality at the second level. In order to change the eye height at the second level while maintaining the maximum power, the bias voltages Vb1 and Vb2 of the DD-MZM need to be optimized so that the eye height at the first level is reduced while the eye height at the second level is increased. The optimum bias voltages are considered under two different conditions for the worst channel in the 32-channel AP-DCDM-WDM system (channel 16), as shown in Table 8.1. The dependence of the Q-factor of all 6 eyes on Vb2 is shown in Table 8.1a at a fixed received power of -25 dBm (receiver sensitivity of the best channel), fixed Vb1 (-2.9 V) and a splitting ratio of 1.3. It can be seen from Table 8.1a that the optimum Vb2 is around -0.6 V, where eyes 1 to 6 have almost similar Q-factors of 5.9, 6.1, 6.1, 6.3, 6.2 and 6, respectively. The variation of the Q-factor for different values of Vb1 with fixed Vb2 (-0.6 V) is shown in Table 8.1b. As illustrated in Table 8.1b, the optimum Vb1 is around -2.9 V, where all eyes have similar Q-factors. Referring to Table 8.1, under optimized bias-voltage conditions the variation in the Q-factor is quite small, and it is expected that the optimum sensitivity is essentially similar for all multiplexed channels [11].
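The selection of the bias point from Table 8.1 can be reproduced with a short script. The sketch below takes the Q-factors from Table 8.1 and picks, for each sweep, the bias voltage whose worst eye is best (a max-min rule) and reports the spread between the best and worst eye. The max-min rule is an assumption made for this illustration; the chapter only states that the eyes should have almost similar Q-factors.

```python
# Q-factors of eyes 1..6 from Table 8.1, keyed by bias voltage in volts
Q_VS_VB2 = {   # part (a): Vb1 fixed
    -1.0: [4.4, 4.7, 6.3, 6.5, 6.4, 6.1],
    -0.8: [5.1, 5.4, 6.2, 6.4, 6.2, 6.1],
    -0.6: [5.9, 6.1, 6.1, 6.3, 6.2, 6.0],
    -0.4: [6.4, 5.9, 6.9, 6.1, 6.0, 5.8],
}
Q_VS_VB1 = {   # part (b): Vb2 fixed
    -3.0: [6.5, 5.5, 7.0, 6.5, 6.0, 5.5],
    -2.9: [5.9, 6.1, 6.1, 6.3, 6.2, 6.0],
    -2.8: [5.5, 5.8, 6.5, 6.3, 6.2, 6.0],
    -2.6: [4.5, 5.0, 6.5, 6.5, 6.2, 6.0],
}


def best_bias(table):
    """Bias voltage that maximizes the Q-factor of the worst eye."""
    return max(table, key=lambda v: min(table[v]))


def q_spread(q_values):
    """Difference between the best and the worst eye."""
    return max(q_values) - min(q_values)


best_vb2 = best_bias(Q_VS_VB2)
best_vb1 = best_bias(Q_VS_VB1)
print(f"Vb2 = {best_vb2} V, spread = {q_spread(Q_VS_VB2[best_vb2]):.1f}")
print(f"Vb1 = {best_vb1} V, spread = {q_spread(Q_VS_VB1[best_vb1]):.1f}")
```

Both sweeps select the same operating point as the text, Vb2 = -0.6 V and Vb1 = -2.9 V, where the six Q-factors lie within 0.4 of each other.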

8.5 32 Channels AP-DCDM-WDM System Performance Using Optimized DD-MZM

The simulation results are obtained by replacing the AM in Fig. 8.1 by the optimized DD-MZM for all 32 channels. The optimized DD-MZM was fixed with a splitting ratio (SR) of 1.3, Vb1 of -2.9 V and Vb2 of -0.6 V. Figure 8.2b shows exemplary eye diagrams taken after the 320 km SSMF (4 spans of 80 km SSMF + 13.4 km DCF) for channel 16. As illustrated, although the eye heights are different, the Q-factors are almost the same. Compared to AP-DCDM with AM, the Q-factors related to the second level are greatly improved (from 3.6 and 3.8 to 5.9 and 6.1 for eyes 1 and 2, respectively). Note that the maximum amplitude values for the AM and DD-MZM eye diagrams are the same. By improving the quality of the second-level eyes, the performance of the worst users (users 1 and 2) in the middle WDM channel (Ch. 16) is significantly improved. In addition, almost the same performance is obtained for all channels.

Figure 8.3 compares the receiver sensitivity of AP-DCDM-WDM with AM and with the optimized DD-MZM for all 32 channels. The degradation of receiver sensitivity is caused by the accumulated spontaneous emission light from each LD through the multiplexing process and by the noise figure (NF) of the pre-amplifier. As shown in Fig. 8.3, the receiver sensitivity was around -21 dBm for the conventional AP-DCDM-WDM system, and the variation between the channels was around 1.5 dB. The receiver sensitivity of the proposed AP-DCDM-WDM system was improved to around -25.1 dBm. Therefore, the proposed solution improves the receiver sensitivity by around 4.1 dB. Figure 8.4 shows the improvement in OSNR of the proposed AP-DCDM-WDM system compared to the conventional AP-DCDM-WDM system at a BER of $10^{-9}$. The reason for this receiver sensitivity and OSNR improvement can be understood by comparing the received eye diagrams depicted in Fig. 8.2a, b.

Fig. 8.3 Pre-amplified receiver sensitivity versus signal wavelength for 32 channels

Fig. 8.4 OSNR versus signal wavelength for 32 channels

8.6 Conclusion

We have presented the performance of the 1.28 Tbit/s AP-DCDM over WDM technique when the drive voltages of the DD-MZM are optimized. In comparison to the previous report [7], a considerable receiver sensitivity improvement (4.1 dB) was achieved. The improvement is due to the increase in eye height, which leads to Q-factor enhancement. These results are significant for the search for the optimum AP-DCDM transmission system.

References

1. Kim H, Essiambre R-J (2003) Transmission of 8 × 20 Gb/s DQPSK signals over 310-km SMF with 0.8-b/s/Hz spectral efficiency. IEEE Photon Technol Lett 15(5):769–771
2. Winzer P, Essiambre R (2006) Advanced modulation formats for high-capacity optical transport networks. J Lightw Technol 24:4711–4728
3. Malekmohammadi A, Abdullah MK, Abas AF, Mahdiraji GA, Mokhtar M (2009) Analysis of RZ-OOK over absolute polar duty cycle division multiplexing in dispersive transmission medium. IET Optoelectron 3(4):197–206
4. Malekmohammadi A, Abas AF, Abdullah MK, Mahdiraji GA, Mokhtar M (2009) Realization of high capacity transmission in fiber optic communication systems using absolute polar duty cycle division multiplexing (AP-DCDM) technique. Opt Fiber Technol 15(4):337–343
5. Cartledge C (1999) Optimizing the bias and modulation voltages of MQW Mach–Zehnder modulators for 10 Gb/s transmission on nondispersion shifted fiber. J Light Tech 17:1142–1151
6. Adams DM, Rolland C, Fekecs A, McGhan D, Somani A, Bradshaw S, Poirier M, Dupont E, Cremer E, Anderson K (1998) 1.55 μm transmission at 2.5 Gbit/s over 1102 km of NDSF using discrete and monolithically integrated InGaAsP/InP Mach–Zehnder modulator and DFB laser. Electron Lett 34:771–773
7. Malekmohammadi A, Abas AF, Abdullah MK, Mahdiraji GA, Mokhtar M, Rasid M (2009) AP-DCDM over WDM system. Opt Commun 282:4233–4241
8. Hoshida T, Vassilieva O, Yamada K, Choudhary S, Pecqueur R, Kuwahara H (2002) Optimal 40 Gb/s modulation formats for spectrally efficient long-haul DWDM system. IEEE J Lightw Tech 20(12):1989–1996
9. Winzer PJ, Chandrasekhar S, Kim H (2003) Impact of filtering on RZ-DPSK reception. IEEE Photon Technol Lett 15(6):840–842
10. Suzuki S, Kawano Y, Nakasha Y (2005) A novel 50-Gbit/s NRZ-RZ converter with retiming function using InP-HEMT technology. In: Presented at the Compound Semiconductor Integrated Circuit Symposium
11. Malekmohammadi A, Abdullah MK, Abas AF (2010) Performance enhancement of AP-DCDM over WDM with dual drive Mach–Zehnder-Modulator in 1.28 Tbit/s optical fiber communication systems. Lecture notes in engineering and computer science: Proceedings of the World Congress on Engineering 2010, WCE 2010, 30 June–2 July, London, UK, pp 948–951

Chapter 9

Wi-Fi WEP Point-to-Point Links

Performance Studies of IEEE 802.11a, b, g Laboratory Links

J. A. R. Pacheco de Carvalho, H. Veiga, N. Marques, C. F. Ribeiro Pacheco and A. D. Reis

Abstract The importance of wireless communications has been growing. Performance is a crucial issue, resulting in more reliable and efficient communications. Security is equally important. Laboratory measurements were made of several performance aspects of Wi-Fi (IEEE 802.11a, b, g) WEP point-to-point links. A contribution is given to the performance evaluation of this technology, using two types of access points from Enterasys Networks (RBT-4102 and RBTR2). Detailed results are presented and discussed, namely at OSI levels 4 and 7, from TCP, UDP and FTP experiments: TCP throughput, jitter, percentage datagram loss and FTP transfer rate. Comparisons are made to corresponding results obtained for open links. Conclusions are drawn about the comparative performance of the links.

J. A. R. P. de Carvalho, C. F. Ribeiro Pacheco, A. D. Reis
Unidade de Detecção Remota, Universidade da Beira Interior, 6201-001 Covilhã, Portugal
e-mail: [email protected]

C. F. Ribeiro Pacheco
e-mail: [email protected]

A. D. Reis
e-mail: [email protected]

H. Veiga, N. Marques
Centro de Informática, Universidade da Beira Interior, 6201-001 Covilhã, Portugal
e-mail: [email protected]

N. Marques
e-mail: [email protected]

S. I. Ao and L. Gelman (eds.), Electrical Engineering and Applied Computing, Lecture Notes in Electrical Engineering, 90, DOI: 10.1007/978-94-007-1192-1_9, © Springer Science+Business Media B.V. 2011

9.1 Introduction

Wireless communications are increasingly important for their versatility, mobility, speed and favourable prices. This is the case for microwave- and laser-based technologies, e.g. Wi-Fi (Wireless Fidelity) and FSO (Free Space Optics), respectively. The importance and utilization of Wi-Fi have been growing as a complement to traditional wired networks. Wi-Fi has been used both in ad hoc mode, for communications in temporary situations e.g. meetings and conferences, and in infrastructure mode. In the latter case, an AP (Access Point) is used to permit communications of Wi-Fi devices with a wired LAN (Local Area Network) through a switch/router. In this way a WLAN, based on the AP, is formed, which is known as a cell. A WPAN (Wireless Personal Area Network) arises in relation to a PAN (Personal Area Network). Point-to-point and point-to-multipoint configurations are used both indoors and outdoors, requiring specialized directional and omnidirectional antennas. Wi-Fi uses microwaves in the 2.4 and 5 GHz frequency bands and the IEEE 802.11a, 802.11b and 802.11g standards [1]. Owing to the increasing use of the 2.4 GHz band, interference increases. Hence, the 5 GHz band has received considerable interest, although absorption increases and ranges are shorter. Nominal transfer rates of up to 11 (802.11b) and 54 Mbps (802.11a, g) are specified. CSMA/CA is the medium access control. Wireless communications, wave propagation [2, 3] and practical WLAN implementations [4] have been studied. Detailed information is available about the 802.11 architecture, including performance analysis of the effective transfer rate, where an optimum factor of 0.42 was presented for 11 Mbps point-to-point links [5]. Wi-Fi (802.11b) performance measurements are available for crowded indoor environments [6].

Performance has been a very important issue, resulting in more reliable and efficient communications. New telematic applications are especially sensitive to performance, when compared to traditional applications. Application characterization and requirements have been discussed e.g. for voice, Hi Fi audio, video on demand, moving images, HDTV images, virtual reality, interactive data, static images, intensive data, supercomputation, electronic mail, and file transfer [7]. For example, requirements have been presented for video on demand/moving images (1–10 ms jitter and 1–10 Mbps throughputs) and for Hi Fi stereo audio (jitter less than 1 ms and 0.1–1 Mbps throughputs).

Wi-Fi microwave radio signals can be easily captured by anyone. WEP (Wired Equivalent Privacy) was initially intended to provide confidentiality comparable to that of a traditional wired network. In spite of its weaknesses, WEP is still widely used in Wi-Fi communications for security reasons. A shared key for data encryption is involved: in WEP, the communicating devices use the same key to encrypt and decrypt radio signals. Several performance measurements have been made for 2.4 and 5 GHz Wi-Fi open links [8–10], as well as for very high speed FSO [11]. In the present work further Wi-Fi (IEEE 802.11a, b, g) results are presented, using WEP, at OSI levels 4 and 7.


Performance is evaluated in laboratory measurements of WEP point-to-point links using the available equipment. Comparisons are made to corresponding results obtained for open links. Conclusions are drawn about the comparative performance of the links. The rest of the paper is structured as follows: Sect. 9.2 presents the experimental details, i.e. the measurement setup and procedure. Results and discussion are presented in Sect. 9.3. Conclusions are drawn in Sect. 9.4.

9.2 Experimental Details Two types of experiments were carried out, which are referred to as Exp-A and Exp-B. In the measurements of Exp-A we used Enterasys RoamAbout RBT-4102 level 2/3/4 access points (referred to as AP-A), equipped with 16–20 dBm IEEE 802.11a/b/g transceivers and internal dual-band diversity antennas [12], and 100-Base-TX/10-Base-T Allied Telesis AT-8000S/16 level 2 switches [13]. The access points had transceivers based on the Atheros 5213A chipset, and firmware version 1.1.51. They were parameterized and monitored through both the console, using the CLI (Command Line Interface), and an embedded HTTPS (Secure HTTP) server. The configuration was for minimum transmitted power and equivalent to point-to-point, LAN to LAN mode, using the internal antenna. For the measurements of Exp-B we used Enterasys RoamAbout RBTR2 level 2/3/4 access points (referred to as AP-B), equipped with 15 dBm IEEE 802.11a/b/g cards [12], and 100-Base-TX/10-Base-T Allied Telesis AT-8000S/16 level 2 switches [13]. The access points had RBTBH-R2 W radio cards similar to the Agere-Systems model 0118 type, and firmware version 6.08.03. They were parameterized and monitored through both the console and the RoamAbout AP Manager software. The configuration was for minimum transmitted power, i.e. micro cell, point-to-point, LAN to LAN mode, using the antenna built into the card. Interference-free channels were used in the communications. This was checked through a portable computer, equipped with a Wi-Fi 802.11a/b/g adapter, running NetStumbler software [14]. WEP encryption was activated, using 128-bit encryption and a shared key for data encryption composed of 13 ASCII characters. No power levels above the minimum were required, as the access points were very close. Both types of experiments, Exp-A and Exp-B, were made using a laboratory setup, which was planned and implemented as shown in Fig. 9.1.
At OSI level 4 measurements were made for TCP connections and UDP communications, using Iperf software [15], permitting network performance results to be recorded. For a TCP connection, TCP throughput was obtained. For a UDP communication with a given bandwidth parameter, UDP throughput, jitter and percentage loss of datagrams were obtained. TCP packets and UDP datagrams of 1470 bytes size were used. A window size of 8 kbytes and a buffer size of the same value were used for TCP and UDP, respectively. In Fig. 9.1, one PC, with IP 192.168.0.2, was the Iperf server and the other, with IP 192.168.0.6, was the Iperf client. Jitter, which represents the smoothed mean of differences between consecutive transit times, was continuously computed by the server, as specified by RTP (Real-time Transport Protocol) in RFC 1889 [16]. The same scheme was used for FTP measurements, where FTP server and client applications were installed in the PCs with IPs 192.168.0.2 and 192.168.0.6, respectively. The PCs were portable computers running Windows XP. They were set up to make maximum resources available to the present work. Also, batch command files were written to enable the TCP, UDP and FTP tests. The results were obtained in batch mode and written as data files to the client PC disk. Each PC had a second network adapter, to permit remote control from the official IP Remote Detection Unit network, via a switch.

Fig. 9.1 Experimental laboratory setup scheme
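The jitter definition quoted above can be illustrated with a minimal sketch of the RFC 1889 running estimator, J = J + (|D| - J)/16, where D is the difference between the transit times of consecutive packets. The function name and the sample timestamps are hypothetical and chosen only for illustration; this is not Iperf's actual code.

```python
def rtp_jitter(send_times, recv_times):
    """Running interarrival jitter estimate in the style of RFC 1889.

    send_times and recv_times are per-packet timestamps in the same unit
    (e.g. ms). The return value is the estimate after the last packet.
    """
    jitter = 0.0
    prev_transit = None
    for sent, received in zip(send_times, recv_times):
        transit = received - sent
        if prev_transit is not None:
            # D: change in transit time between consecutive packets
            d = abs(transit - prev_transit)
            # Exponential smoothing with gain 1/16, as given in RFC 1889
            jitter += (d - jitter) / 16.0
        prev_transit = transit
    return jitter


# Hypothetical timestamps in ms: the transit time changes from 5 to 21 ms.
example_jitter = rtp_jitter([0.0, 10.0], [5.0, 31.0])
print(example_jitter)
```

Because of the 1/16 gain, a single 16 ms change in transit time moves the estimate by only 1 ms, which is why the reported jitter behaves as a smoothed mean rather than an instantaneous value.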

9.3 Results and Discussion Both access points AP-A and AP-B were configured with various fixed transfer rates for every one of the standards IEEE 802.11b (1, 2, 5.5 and 11 Mbps), 802.11g and 802.11a (6, 9, 12, 18, 24, 36, 48 and 54 Mbps). At OSI level 1 (physical layer), for every one of the cases, the local and remote values of the signal to noise ratios SNR were recorded. The best SNR levels were observed for 802.11g and 802.11a. Performance measurements, using TCP connections and UDP communications at OSI level 4 (transport layer), were carried out for both Exp-A and Exp-B. In each experiment, for every standard and nominal fixed transfer rate, an average TCP throughput was determined from several experiments. This value was used as the bandwidth parameter for every corresponding UDP test, giving average jitter and average percentage datagram loss. The results are shown in Figs. 9.2, 9.3 and 9.4.


Fig. 9.2 TCP throughput results versus technology and nominal transfer rate; Exp-A and Exp-B

Figure 9.2 shows the results from Exp-A and Exp-B, where polynomial fits were made to the TCP throughput data for each AP implementation of IEEE 802.11a, b, g, and R² is the coefficient of determination. It follows that the best TCP throughputs are, in descending order, for 802.11a, 802.11g and 802.11b. In Exp-A (Fig. 9.2), the data for 802.11a are on average 32.6% higher than for 802.11g. The average values are 13.10 ± 0.39 Mbps for 802.11a, and 9.62 ± 0.29 Mbps for 802.11g. These values are in good agreement with those obtained for the same AP type and open links (13.19 ± 0.40 Mbps and 9.97 ± 0.30 Mbps for 802.11a and 802.11g, respectively) [9]. For 802.11b, the average value is 2.55 ± 0.08 Mbps. Also, the 802.11b data for 5.5 and 11 Mbps (average 4.05 ± 0.12 Mbps) are in good agreement with those obtained for the same AP type and open links (4.08 ± 0.12 Mbps) [9]. In Exp-B (Fig. 9.2), the data for 802.11a are on average 2.9% higher than for IEEE 802.11g. The average values are 12.97 ± 0.39 Mbps for 802.11a, and 12.61 ± 0.38 Mbps for 802.11g. These values are in good agreement with those obtained for the same AP type and open links (12.92 ± 0.39 Mbps and 12.60 ± 0.38 Mbps for 802.11a and 802.11g, respectively) [9]. For 802.11b, the average value is 2.42 ± 0.07 Mbps.
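As a rough cross-check of the comparisons above, the following sketch computes how much higher one quoted average throughput is than another. The function name is hypothetical. Note that the ratio of two averages need not equal the average of per-rate differences, which may be how the quoted Exp-A figure of 32.6% was obtained; that is an assumption, since the source does not state the method.

```python
def percent_higher(value, reference):
    """Return by how many percent `value` exceeds `reference`."""
    return 100.0 * (value - reference) / reference


# Average TCP throughputs (Mbps) quoted in the text for 802.11a and 802.11g.
exp_a = percent_higher(13.10, 9.62)    # Exp-A, ratio of the two averages
exp_b = percent_higher(12.97, 12.61)   # Exp-B, ratio of the two averages

print(f"Exp-A: {exp_a:.1f}%  Exp-B: {exp_b:.1f}%")
```

With the quoted averages this gives roughly 36% for Exp-A and 2.9% for Exp-B, so only the Exp-B figure is reproduced directly by the ratio of averages.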


Fig. 9.3 UDP—jitter results versus technology and nominal transfer rate; Exp-A and Exp-B

Also, the 802.11b data for 5.5 and 11 Mbps (average 3.88 ± 0.12 Mbps) are in good agreement with those obtained for the same AP type and open links (3.84 ± 0.12 Mbps) [9]. The best TCP throughput performance was for AP-B. For both Exp-A and Exp-B, in Figs. 9.3 and 9.4 the data points representing jitter and percentage datagram loss, respectively, were joined by smoothed lines. In Exp-A (Fig. 9.3) the jitter data are on average lower for 802.11a (1.9 ± 0.1 ms) than for 802.11g (2.6 ± 0.1 ms). Similar trends were observed for the same AP type and open links (1.3 ± 0.1 ms for 802.11a and 2.4 ± 0.1 ms for 802.11g) [9]. For 802.11b, the average value is 4.8 ± 0.3 ms. Also, the 802.11b data for 5.5 and 11 Mbps (average 5.6 ± 0.9 ms) are higher than those for the same AP type and open links (3.7 ± 0.5 ms) [9]. In Exp-B (Fig. 9.3), the jitter data (1.8 ± 0.1 ms on average) show fair agreement for IEEE 802.11a and 802.11g. Similar trends were observed for the same AP type and open links (1.9 ± 0.1 ms on average) [9]. For 802.11b the average value is 1.6 ± 0.1 ms. Also, the 802.11b data for 5.5 and 11 Mbps (average 2.5 ± 0.5 ms) are in good agreement with those for the same AP type and open links (2.6 ± 0.2 ms) [9]. The best jitter performance was for AP-B.


Fig. 9.4 UDP—percentage datagram loss results versus technology and nominal transfer rate; Exp-A and Exp-B

In both Exp-A and Exp-B (Fig. 9.4), generally, the percentage datagram loss data agree rather well for all standards. They are on average 1.2 ± 0.1%. This is in good agreement with the results for the same AP types and open links (on average, 1.3 ± 0.2% for AP-A and 1.2 ± 0.2% for AP-B) [9]. AP-A and AP-B have shown similar percentage datagram loss performances. At OSI level 7 (application layer), FTP transfer rates were measured versus nominal transfer rates configured in the APs for the IEEE 802.11a, b, g standards. Every measurement was the average for a single FTP transfer, using a binary file size of 100 Mbytes. The results from Exp-A and Exp-B are represented in Fig. 9.5. Polynomial fits to the data were made for the implementation of every standard. It was found that in both cases the best performances were, in descending order, for 802.11a, 802.11g and 802.11b: the same trends found for TCP throughput. The FTP transfer rates obtained in Exp-A, using IEEE 802.11b, were close to those in Exp-B. The FTP performances obtained for Exp-A and IEEE 802.11a were only slightly better in comparison with Exp-B. In contrast, for Exp-A and IEEE 802.11g, FTP performances were significantly worse than in Exp-B, suggesting that AP-B had a better FTP performance than AP-A for IEEE 802.11g. Similar trends had been observed for corresponding open links [9].


Fig. 9.5 FTP transfer rate results versus technology and nominal transfer rate; Exp-A and Exp-B

Generally, the results measured for the WEP links agree reasonably well, within the experimental errors, with corresponding data obtained for open links.

9.4 Conclusions In the present work a laboratory setup was planned and implemented that permitted systematic performance measurements of available access point equipment (RBT-4102 and RBTR2 from Enterasys) for Wi-Fi (IEEE 802.11a, b, g) in WEP point-to-point links. At OSI layer 4, TCP throughput, jitter and percentage datagram loss were measured and compared for each standard. The best TCP throughputs were found, in descending order, for 802.11a, 802.11g and 802.11b. TCP throughputs were also found sensitive to AP type. Similar trends were observed for the same AP types and open links. The lowest average jitter values were found for IEEE 802.11a and 802.11g. Some sensitivity to AP type was observed. For the percentage datagram loss, a reasonably good agreement was found, on average, for all standards and AP types. Similar trends were observed for the same AP types and open links. At OSI layer 7, the measurements of FTP transfer rates have shown that the best FTP performances were, in descending order, for 802.11a, 802.11g and 802.11b. This result shows the same trends found for TCP throughput. Similar trends were observed for the same AP types and open links. FTP performances were also found sensitive to AP type. Generally, the results measured for WEP links agree reasonably well, within the experimental errors, with corresponding data obtained for open links. Additional performance measurements have either started or are planned using several types of equipment, not only in the laboratory but also in outdoor environments involving, mainly, medium range links.

Acknowledgments Support from University of Beira Interior and FCT (Fundação para a Ciência e a Tecnologia)/POCI2010 (Programa Operacional Ciência e Inovação) is acknowledged. We acknowledge Enterasys Networks for their availability.

References
1. IEEE Std 802.11-2007, IEEE standard for local and metropolitan area networks-specific requirements-part 11: wireless LAN medium access control (MAC) and physical layer (PHY) specifications (10 October 2007); http://standards.ieee.org/getieee802
2. Mark JW, Zhuang W (2003) Wireless communications and networking. Prentice-Hall Inc, Upper Saddle River
3. Rappaport TS (2002) Wireless communications principles and practice, 2nd edn. Prentice-Hall Inc, Upper Saddle River
4. Bruce WR III, Gilster R (2002) Wireless LANs end to end. Hungry Minds Inc, New York
5. Schwartz M (2005) Mobile wireless communications. Cambridge University Press, Cambridge
6. Sarkar NI, Sowerby KW (2006) High performance measurements in the crowded office environment: a case study. In: Proceedings ICCT'06-International Conference on Communication Technology, Guilin, China, 27–30 November, pp 1–4
7. Monteiro E, Boavida F (2002) Engineering of informatics networks, 4th edn. FCA-Editor of Informatics Ld, Lisbon
8. Pacheco de Carvalho JAR, Gomes PAJ, Veiga H, Reis AD (2008) Development of a university networking project. In: Putnik GD, Manuela Cunha M (eds) Encyclopedia of Networked and Virtual Organizations. IGI Global, Hershey, pp 409–422
9. Pacheco de Carvalho JAR, Veiga H, Gomes PAJ, Ribeiro Pacheco CF, Marques N, Reis AD (2010) Wi-Fi point-to-point links: performance aspects of IEEE 802.11a, b, g laboratory links. In: Ao S-I, Gelman L (eds) Electronic Engineering and Computing Technology, Series: Lecture Notes in Electrical Engineering, vol 60. Springer, Netherlands, pp 507–514
10. Pacheco de Carvalho JAR, Veiga H, Marques N, Ribeiro Pacheco CF, Reis AD (2010) Laboratory performance of Wi-Fi WEP point-to-point links: a case study. Lecture notes in engineering and computer science: Proceedings of the World Congress on Engineering 2010, WCE 2010, vol I. 30 June–2 July, London, UK, pp 764–767
11. Pacheco de Carvalho JAR, Veiga H, Gomes PAJ, Cláudia F, Ribeiro Pacheco FP, Reis AD (2008) Experimental performance study of very high speed free space optics link at the University of Beira Interior campus: a case study. In: Proceedings ISSPIT 2008-8th IEEE International Symposium on Signal Processing and Information Technology, Sarajevo, Bosnia and Herzegovina, December 16–19, pp 154–157
12. Enterasys Networks, Roam About R2, RBT-4102 Wireless Access Points (20 December 2008). http://www.enterasys.com
13. Allied Telesis, AT-8000S/16 Layer 2 Managed Fast Ethernet Switch (20 December 2008). http://www.alliedtelesis.com
14. NetStumbler software. http://www.netstumbler.com
15. Iperf software, NLANR. http://dast.nlanr.net
16. Network Working Group, RFC 1889-RTP: A Transport Protocol for Real Time Applications. http://www.rfc-archive.org

Chapter 10

Interaction Between the Mobile Phone and Human Head of Various Sizes

Adel Zein El Dein Mohammed Moussa and Aladdein Amro

Abstract This chapter analyzes the specific absorption rate (SAR) induced in human head models of various sizes by a mobile phone at 900 and 1800 MHz. Specifically, the study considers the differences in SAR between adults and children. Moreover, these differences are assessed for compliance with international safety guidelines. The effects of these head models on the most important parameters for a mobile terminal antenna designer, namely radiation efficiency, total efficiency and directivity, are also investigated.

A. Z. E. D. M. Moussa (&) Department of Electrical Engineering, High Institute of Energy, South Valley University, Aswan, 81258, Egypt e-mail: [email protected] A. Amro Department of Communications Engineering faculty of Engineering, Al-Hussein Bin Talal University, P.O. Box 20, Ma’an, Jordan e-mail: [email protected]

S. I. Ao and L. Gelman (eds.), Electrical Engineering and Applied Computing, Lecture Notes in Electrical Engineering, 90, DOI: 10.1007/978-94-007-1192-1_10, © Springer Science+Business Media B.V. 2011

10.1 Introduction In recent years, much attention has been paid to the health implications of electromagnetic (EM) waves, especially for the human head, which is exposed to the EM fields radiated from handsets. The recent explosive increase in the use of mobile communication handsets, and especially in the number of children using a mobile phone, raises many questions about the nature and degree of absorption of EM waves by this category of the public as a function of age and morphology. For this reason the World Health Organization (WHO) has recommended undertaking research studies on this subject [1–3].

Fig. 10.1 Description of the various sizes of human head models (100, 95, 90, 85 and 80% of the adult head size)

This chapter investigates the effects of head models of various sizes on the most important parameters for a mobile terminal antenna designer, namely radiation efficiency, total efficiency and directivity, and also on the Specific Absorption Rates (SAR) induced in them. For this purpose, these parameters are compared between an adult human head and several child heads, each sized as a percentage of the adult head. The results are obtained using an electromagnetic field solver employing the Integral Equations method [4]. The SAR is the most appropriate metric for determining EM exposure in the very near field of a Radio Frequency (RF) source [5–9]. The local SAR (W/kg) at any point in the human head is defined as:

$$\mathrm{SAR} = \frac{\sigma E^{2}}{2\rho} \qquad (10.1)$$

where E is the peak amplitude of the electric field in the human head tissue (V/m), σ is the tissue conductivity (S/m) and ρ is the tissue density (kg/m³). The SAR averaged over masses of 10 and 1 g in the head and the other parameters of the mobile antenna are determined in each case.
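Equation (10.1) can be evaluated directly. The sketch below is only an illustration: the function name is hypothetical, the conductivity and density are the brain-liquid values listed for 900 MHz in Table 10.1, and the field amplitude of 50 V/m is an assumed value, not a result from the chapter.

```python
def local_sar(sigma, e_peak, rho):
    """Local SAR in W/kg from Eq. (10.1): sigma * E^2 / (2 * rho).

    sigma  : tissue conductivity in S/m
    e_peak : peak amplitude of the electric field in V/m
    rho    : tissue density in kg/m^3
    """
    return sigma * e_peak ** 2 / (2.0 * rho)


# Brain liquid at 900 MHz: sigma = 0.77 S/m, rho = 1030 kg/m^3 (Table 10.1).
# E = 50 V/m is a hypothetical field amplitude chosen for illustration only.
sar_example = local_sar(sigma=0.77, e_peak=50.0, rho=1030.0)
print(f"{sar_example:.3f} W/kg")
```

Because the SAR depends on the square of the field, doubling E in this sketch quadruples the result, while the factor of 2 in the denominator reflects the use of the peak rather than the rms amplitude.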

10.2 Modeling of Human Head For this study, five head models are used: that of an adult (100% size) and four child head models with sizes of 95, 90, 85 and 80% of the adult head size, as shown in Fig. 10.1. Each head model consists of a shell of skin tissue filled with a liquid having the properties of brain tissue. For simulation of the EM fields in the human head, the appropriate values of the conductivity σ (S/m), the relative permittivity εr and the tissue density ρ (kg/m³) of all the materials used in the calculation must be known. Additionally, the frequency dependence of these parameters must be considered and chosen appropriately. A compilation by Gabriel et al. covers a wide range of different body tissues and offers equations to determine the appropriate dielectric values at each desired frequency [10, 11]. Table 10.1 shows the real part of the dielectric permittivity εr, the conductivity σ (S/m), and the mass density ρ (kg/m³) of the tissues used in the simulations at 900 and 1800 MHz. Table 10.2 shows the volume and the mass of the tissue of all head models.

Table 10.1 Dielectric permittivity εr, conductivity σ (S/m), and mass density ρ (kg/m³) of tissues used in the simulations at 900 and 1800 MHz

Tissue           Frequency   Dielectric permittivity εr   Conductivity σ (S/m)   Mass density ρ (kg/m³)
Shell (skin)     900 MHz     43.8                         0.86                   1000
Shell (skin)     1800 MHz    38.87                        1.19                   1000
Liquid (brain)   900 MHz     45.8                         0.77                   1030
Liquid (brain)   1800 MHz    43.5                         1.15                   1030

Table 10.2 Volume and mass of the head models

Human head size as a percent of an adult one (%)   100      95       90       85       80
Tissue volume (10^6 mm³)                           5.5893   4.7886   4.0706   3.4283   2.8573
Tissue mass (kg)                                   5.7439   4.9236   4.1855   3.5250   2.9379
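The entries of Table 10.2 can be sanity-checked with a short sketch. It computes the mean density implied by each mass and volume pair and compares the volume ratios with the cube of the size percentage. That the percentage is a linear scale factor is an assumption made here, not something the chapter states; all variable and function names are hypothetical.

```python
# Values transcribed from Table 10.2 (volume in 10^6 mm^3, mass in kg).
sizes_percent = [100, 95, 90, 85, 80]
volumes = [5.5893, 4.7886, 4.0706, 3.4283, 2.8573]
masses = [5.7439, 4.9236, 4.1855, 3.5250, 2.9379]


def mean_density(volume_1e6_mm3, mass_kg):
    """Mean density in kg/m^3; note that 10^6 mm^3 equals 10^-3 m^3."""
    return mass_kg / (volume_1e6_mm3 * 1e-3)


densities = [mean_density(v, m) for v, m in zip(volumes, masses)]

# Volume relative to the adult head versus the cube of the linear scale.
volume_ratios = [v / volumes[0] for v in volumes]
cubic_ratios = [(s / 100.0) ** 3 for s in sizes_percent]

for s, d, vr, cr in zip(sizes_percent, densities, volume_ratios, cubic_ratios):
    print(f"{s}%: density {d:.0f} kg/m^3, volume ratio {vr:.4f}, scale^3 {cr:.4f}")
```

Under these assumptions the implied densities fall between the skin and brain values of Table 10.1, and the volume ratios follow the cubic law closely, which is consistent with uniformly scaled models.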

10.3 Modeling of the Mobile Phone The mobile handset consists of a quarter-wavelength monopole (of radius 0.0025 m at 900 MHz and 0.001 m at 1800 MHz) mounted on a handset body (treated as a metal box of 1.8 × 4 × 10 cm). It operates at 900 and 1800 MHz with a radiated power of 0.125 W, as shown in Fig. 10.2.
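The monopole lengths are not stated explicitly, but a quarter wavelength in free space follows from λ/4 = c/(4f). The sketch below is an idealized estimate under that free-space assumption; a practical monopole is usually trimmed slightly shorter, and the function name is hypothetical.

```python
SPEED_OF_LIGHT = 299_792_458.0  # m/s, exact by definition


def quarter_wavelength(frequency_hz):
    """Free-space quarter wavelength in metres for the given frequency."""
    return SPEED_OF_LIGHT / (4.0 * frequency_hz)


for f_mhz in (900, 1800):
    length_cm = 100.0 * quarter_wavelength(f_mhz * 1e6)
    print(f"{f_mhz} MHz: quarter wavelength of about {length_cm:.2f} cm")
```

This gives roughly 8.3 cm at 900 MHz and 4.2 cm at 1800 MHz, both comparable to the 10 cm height of the handset box.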

10.4 Results and Discussion Figures 10.3 and 10.4 present the mobile terminal antenna designer parameters, namely return loss, radiation efficiency, total efficiency and directivity, obtained in the absence of the human head at a frequency of 1800 MHz. Table 10.3 presents the mobile antenna parameters, namely radiation efficiency, total efficiency and directivity, obtained for the various sizes of human head and for the case of their absence. The values in Table 10.3 show that, as the size of the human head decreases, the radiation efficiency and total efficiency increase, whereas the directivity decreases. The SAR results of the different kinds are given in Table 10.4 for each frequency and for each studied head model.


Fig. 10.2 Description of the mobile handset

Fig. 10.3 Return loss without human head

The "SAR 10 g" is the maximum SAR value averaged over 10 g, obtained by averaging the SAR around each point in the volume and adding the nearest points until an average mass of 10 g is reached, the resulting volume having the shape of a portion of a sphere. The "contiguous SAR 1 g" is estimated by averaging around the local maximum SAR, adding the highest-SAR volume in a given tissue until a mass of 1 g is reached. The SAR (point) is the local value of SAR at every point inside the head model. The results show that, as the head size decreases, the peak SAR 1 g and peak SAR 10 g decrease. The absorbed power also decreases, but more slowly than the head mass, so the total SAR (absorbed power per unit mass) in children's heads increases as the head size decreases, as indicated in Table 10.4. The local SAR (point) shows no monotonic trend with head size at 900 MHz and decreases with decreasing head size at 1800 MHz. Also from Table 10.4 it is noticed that the total SAR over the whole human head at 1800 MHz is less than that at 900 MHz. This is because the SAR regions produced by the monopole antenna at 900 MHz are more extended than those induced at 1800 MHz. The human body acts as a barrier, mainly at high frequencies, because of the skin depth: as the frequency increases, the penetration depth decreases and the wave becomes more susceptible to obstacles. Figures 10.5, 10.6, 10.7, 10.8, 10.9 and 10.10 show the distributions of the local SAR at the y = 0 plane, the 10 g SAR in the xz plane and the 1 g SAR in the xy plane, in W/kg, in the human head of various sizes, obtained with a radiated power of 125 mW from a monopole antenna operating at 1800 MHz (Figs. 10.5–10.7) and at 900 MHz (Figs. 10.8–10.10). It can be easily noticed that the high SAR regions produced by the 900 MHz monopole antenna are more extended than those induced by the 1800 MHz monopole antenna, as explained above.

Fig. 10.4 Far field without human head

Table 10.3 Mobile antenna parameters with various sizes of human head

                                        Human head size as a percent of an adult one (%)
Frequency   Parameter     Without head   100     95      90      85      80
900 MHz     Rad. η        1.003          0.276   0.292   0.311   0.335   0.357
900 MHz     Tot. η        0.788          0.271   0.286   0.305   0.328   0.35
900 MHz     Dir. (dBi)    2.627          6.066   5.943   5.859   5.819   5.712
1800 MHz    Rad. η        1.003          0.485   0.498   0.512   0.528   0.546
1800 MHz    Tot. η        1.002          0.476   0.489   0.504   0.521   0.539
1800 MHz    Dir. (dBi)    3.653          7.98    7.855   7.756   7.673   7.552

Table 10.4 SAR induced in children's heads

                                           Human head size as a percent of an adult one (%)
Frequency   Calculated parameter           100     95      90      85      80
900 MHz     SAR (point) (W/kg)             1.134   1.206   1.124   1.122   1.214
900 MHz     SAR 1 g (W/kg)                 0.818   0.805   0.785   0.769   0.769
900 MHz     SAR 10 g (W/kg)                0.593   0.59    0.584   0.58    0.572
900 MHz     Absorbed power (W rms)         0.089   0.087   0.085   0.082   0.079
900 MHz     Total SAR (W/kg)               0.016   0.018   0.02    0.023   0.027
1800 MHz    SAR (point) (W/kg)             4.149   3.078   2.404   2.319   2.282
1800 MHz    SAR 1 g (W/kg)                 1.590   1.530   1.482   1.399   1.312
1800 MHz    SAR 10 g (W/kg)                0.922   0.887   0.848   0.805   0.764
1800 MHz    Absorbed power (W rms)         0.064   0.062   0.060   0.058   0.056
1800 MHz    Total SAR (W/kg)               0.011   0.012   0.014   0.016   0.019

Fig. 10.5 Distributions of the local SAR at x = 0 plane for 1800 MHz
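The total SAR values of Table 10.4 appear consistent with dividing the absorbed power by the head mass of Table 10.2. The sketch below checks this for every head size and frequency; treating total SAR as absorbed power over total mass is an assumption made here, and all names are hypothetical.

```python
# Head masses in kg from Table 10.2, keyed by head size in percent.
head_mass_kg = {100: 5.7439, 95: 4.9236, 90: 4.1855, 85: 3.5250, 80: 2.9379}

# Absorbed power (W) and tabulated total SAR (W/kg) from Table 10.4.
absorbed_power_w = {
    900: {100: 0.089, 95: 0.087, 90: 0.085, 85: 0.082, 80: 0.079},
    1800: {100: 0.064, 95: 0.062, 90: 0.060, 85: 0.058, 80: 0.056},
}
tabulated_total_sar = {
    900: {100: 0.016, 95: 0.018, 90: 0.020, 85: 0.023, 80: 0.027},
    1800: {100: 0.011, 95: 0.012, 90: 0.014, 85: 0.016, 80: 0.019},
}


def total_sar(power_w, mass_kg):
    """Whole-head average SAR in W/kg (assumed: absorbed power / mass)."""
    return power_w / mass_kg


# Largest deviation between the recomputed and the tabulated values.
max_deviation = max(
    abs(total_sar(absorbed_power_w[f][s], head_mass_kg[s]) - tabulated_total_sar[f][s])
    for f in absorbed_power_w
    for s in head_mass_kg
)
print(f"largest deviation: {max_deviation:.4f} W/kg")
```

The recomputed values agree with the tabulated ones to within the rounding of the three-decimal entries, which supports reading the increase of total SAR for smaller heads as a mass effect: the mass falls faster than the absorbed power.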

10.5 Conclusion The obtained results show that the spatial-peak SAR values at a point, or as averaged over 1 and 10 g, in human heads of various sizes, obtained with a radiated power of 125 mW from a monopole antenna operating at 900 and 1800 MHz, vary with the size of the head at each frequency. The size of the head also has an effect on the mobile terminal antenna designer parameters, and this effect cannot be eliminated, because it is an electromagnetic characteristic. The spatial-peak SAR values averaged over 1 g, obtained with a radiated power of 0.125 W, are in all simulations below the limit of 1.6 W/kg recommended by the FCC, although the adult-head value at 1800 MHz (1.590 W/kg) is only marginally below it. The values averaged over 10 g are likewise below the limit of 2 W/kg recommended by ICNIRP [12–14].

Fig. 10.6 Distributions of the (10 g) SAR at xz plane for 1800 MHz

Fig. 10.7 The distributions of the (1 g) SAR at xy plane for 1800 MHz

Fig. 10.8 Distributions of the local SAR at x = 0 plane for 900 MHz

Fig. 10.9 Distributions of the (10 g) SAR at xz plane for 900 MHz

Fig. 10.10 Distributions of the (1 g) SAR at xy plane for 900 MHz
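The compliance statement can be made concrete with a small sketch that compares the largest 1 g SAR value of Table 10.4 with the 1.6 W/kg limit and reports the remaining margin. The variable names are hypothetical; the values are transcribed from the table.

```python
LIMIT_1G_W_PER_KG = 1.6  # 1 g averaged limit quoted in the text

# SAR 1 g values (W/kg) from Table 10.4 for head sizes 100, 95, 90, 85, 80%.
sar_1g = {
    900: [0.818, 0.805, 0.785, 0.769, 0.769],
    1800: [1.590, 1.530, 1.482, 1.399, 1.312],
}

# Worst case over both frequencies and all head sizes.
worst_case = max(value for values in sar_1g.values() for value in values)
margin_percent = 100.0 * (LIMIT_1G_W_PER_KG - worst_case) / LIMIT_1G_W_PER_KG

print(f"worst case {worst_case:.3f} W/kg, margin {margin_percent:.2f}% of the limit")
```

The worst case is the adult head at 1800 MHz, which leaves a margin of well under 1% of the limit, so small changes in radiated power or antenna position could matter for compliance in that configuration.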

References
1. International Association of Engineers [Online]. Available: http://www.iaeng.org
2. El Dein AZ, Amr A (2010) Specific absorption rate (SAR) induced in human heads of various sizes when using a mobile phone at 900 and 1800 MHz. Lecture notes in engineering and computer science: Proceedings of the World Congress on Engineering 2010, vol I, WCE 2010, 30 June–2 July, London, UK, pp 759–763
3. Kitchen R (2001) RF and microwave radiation safety handbook, Chapter 3, 2nd edn. Newnes, Oxford, pp 47–85
4. CST Microwave studio site. Available: http://www.cst.com/
5. Kiminami K, Iyama T, Onishi T, Uebayashi S (2008) Novel specific absorption rate (SAR) estimation method based on 2-D scanned electric fields. IEEE Trans Electromagn Compat 50(4):828–836
6. Watanabe S, Taki M, Nojima T, Fujiwara O (1996) Characteristics of the SAR distributions in a head exposed to electromagnetic fields radiated by a hand-held portable radio. IEEE Trans Microwave Theory Tech 44(10):1874–1883
7. Hadjem A, Lautru D, Dale C, Wong MF, Fouad-Hanna V, Wiart J (2004) Comparison of specific absorption rate (SAR) induced in child-sized and adult heads using a dual band mobile phone. Proceedings of the IEEE MTT-S Int. Microwave Symposium Digest, June 2004
8. Kivekäs O, Ollikainen J, Lehtiniemi T, Vainikainen P (2004) Bandwidth, SAR, and efficiency of internal mobile phone antennas. IEEE Trans Electromagn Compat 46(1):71–86
9. Beard BB et al (2006) Comparisons of computed mobile phone induced SAR in the SAM phantom to that in anatomically correct models of the human head. IEEE Trans Electromagn Compat 48(2):397–407
10. Gabriel C (1996) Compilation of the dielectric properties of body tissues at RF and microwave frequencies. Brooks Air Force Technical Report AL/OE-TR-1996-0037 [Online]. Available: http://www.fcc.gov/cgi-bin/dielec.sh
11. El Dein AZ (2010) Interaction between the human body and the mobile phone. Book published by LAP Lambert Academic, ISBN 978-3-8433-5186-7
12. FCC, OET Bulletin 65, Evaluating Compliance with FCC Guidelines for Human Exposure to Radiofrequency Electromagnetic Fields. Edition 97-01, released December, 1997
13. IEEE C95.1-1991 (1992) IEEE standard for safety levels with respect to human exposure to radio frequency electromagnetic fields, 3 kHz to 300 GHz. Institute of Electrical and Electronics Engineers, Inc., New York
14. European Committee for Electrotechnical Standardization (CENELEC) (1995) Prestandard ENV 501 66-2, Human exposure to electromagnetic fields. High frequency (10 kHz to 300 GHz)

Chapter 11

A Medium Range Gbps FSO Link: Extended Field Performance Measurements

J. A. R. Pacheco de Carvalho, N. Marques, H. Veiga, C. F. Ribeiro Pacheco and A. D. Reis

Abstract Wireless communications have become increasingly important. Besides Wi-Fi, FSO plays a very relevant technological role in this context. Performance is essential, resulting in more reliable and efficient communications. A 1.14 km FSO medium range link has been successfully implemented for high requirement applications at Gbps rates. An extended experimental performance evaluation of this link has been carried out at OSI levels 1, 4 and 7, through a specifically planned field test arrangement. Several results, obtained from simultaneous measurements of the powers received by the laser heads during TCP, UDP and FTP experiments, are presented and discussed.

J. A. R. P. de Carvalho (&) C. F. Ribeiro Pacheco A. D. Reis Unidade de Detecção Remota, Universidade da Beira Interior, 6201-001, Covilhã, Portugal e-mail: [email protected] C. F. Ribeiro Pacheco e-mail: [email protected] A. D. Reis e-mail: [email protected] N. Marques H. Veiga Centro de Informática, Universidade da Beira Interior, 6201-001, Covilhã, Portugal e-mail: [email protected] H. Veiga e-mail: [email protected]

S. I. Ao and L. Gelman (eds.), Electrical Engineering and Applied Computing, Lecture Notes in Electrical Engineering, 90, DOI: 10.1007/978-94-007-1192-1_11, © Springer Science+Business Media B.V. 2011


11.1 Introduction Wi-Fi and FSO are wireless communications technologies whose importance and utilization have been growing because of their versatility, mobility, speed and favourable prices. Wi-Fi uses microwaves in the 2.4 and 5 GHz frequency bands and the IEEE 802.11a, b, g standards. Nominal transfer rates up to 11 (802.11b) and 54 Mbps (802.11a, g) are specified [1]. It has been used in ad hoc and infrastructure modes. Point-to-point and point-to-multipoint configurations are used both indoors and outdoors, requiring specific directional and omnidirectional antennas. FSO uses laser technology to provide point-to-point communications, e.g. to interconnect the LANs of two buildings having line-of-sight. FSO was developed in the 1960s for military and other purposes, including high requirement applications. At present, speeds typically up to 2.5 Gbps are possible, with ranges up to a few km, depending on technology and atmospheric conditions. Interfaces such as Fast Ethernet and Gigabit Ethernet are used to communicate with LANs. Typical laser wavelengths of 785, 850 and 1550 nm are used. In an FSO link the transmitters deliver high power light which, after travelling through the atmosphere, appears as low power light at the receiver. The link margin of the connection represents the amount of light received by a terminal over the minimum value required to keep the link active: link margin (dB) = 10 log10(P/Pmin), where P and Pmin are the received power and the minimum required power, respectively. Several factors are related to performance degradation in the design of an FSO link: distance between optical emitters; line of sight; alignment of optical emitters; stability of the mounting points; atmospheric conditions; water vapour or hot air; strong electromagnetic interference; wavelength of the laser light [2]. A redundant microwave link is always essential, as the laser link can fail under adverse conditions, interrupting communications.
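The link margin expression above can be sketched as a small function. The name and the sample power values are hypothetical and chosen for illustration; they are not measurements from this link.

```python
import math


def link_margin_db(received_power, minimum_power):
    """Link margin in dB: 10 * log10(P / Pmin).

    Both powers must be positive and expressed in the same unit.
    """
    if received_power <= 0 or minimum_power <= 0:
        raise ValueError("powers must be positive")
    return 10.0 * math.log10(received_power / minimum_power)


# Hypothetical example: received power ten times the minimum required value.
margin = link_margin_db(received_power=1e-5, minimum_power=1e-6)
print(f"{margin:.1f} dB")
```

A margin of 0 dB means the receiver is at its threshold, so any additional atmospheric attenuation would interrupt the link; each further 10 dB of margin tolerates a tenfold drop in received power.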
Several studies and implementations of FSO have been reported [3, 4]. FSO has been used in hybrid systems for temporary multimedia applications [5]. Performance has been a very important issue, resulting in more reliable and efficient communications. Telematic applications have specific performance requirements, depending on the application. New telematic applications are especially sensitive to performance, when compared to traditional applications. For example, requirements have been quoted as: for video on demand/moving images, 1–10 ms jitter and 1–10 Mbps throughput; for Hi Fi stereo audio, jitter less than 1 ms and 0.1–1 Mbps throughput [6]. Several performance measurements have been made for Wi-Fi [7, 8]. FSO and fiber optics have been applied at the University of Beira Interior Campus, at Covilhã City, Portugal, to improve communications quality [9–12]. In the present work we have further investigated that FSO link, for extended performance evaluation at OSI levels 1, 4 and 7. The rest of the chapter is structured as follows: Sect. 11.2 presents the experimental details, i.e. the measurement setup and procedure. Results and discussion are presented in Sect. 11.3. Conclusions are drawn in Sect. 11.4.


Fig. 11.1 View of the 1.14 km laser link between Pole II (SB) and Pole III (FM)

11.2 Experimental Details The main experimental details for testing the quality of the FSO link are as follows. A 1 Gbps full-duplex link was planned and implemented to interconnect the LAN at the Faculty of Medicine building and the main University network, to support medical imaging, VoIP, audio and video traffic [9, 10]. An FSO laser link at 1 Gbps full-duplex, over a distance of 1.14 km, was then created to interconnect the Faculty of Medicine (FM) building at Pole III and the Sports (SB) building at Pole II of the University (Fig. 11.1). We chose laser heads from FSONA (Fig. 11.2) to implement the laser link at a laser wavelength of λ = 1550 nm for eye safety, as the allowable laser power is about fifty times higher at 1550 nm than at 800 nm [2–13]. Each laser head comprised two independent transmitters, for redundancy, and one wide aperture receiver. Each laser had 140 mW of power, resulting in an output power of 280 mW (24.5 dBm). 1000-Base-LX links over OM3 50/125 μm fiber were used to connect the laser heads to the LANs. For redundancy, an 802.16d WiMAX point-to-point link at 5.4 GHz was available, where data rates up to either 75 Mbps or 108 Mbps were possible in normal mode or in turbo mode, respectively [14]. This link was used as a backup link for FM-SB communications, through configuration of two static routing entries in the switching/routing equipment [9].
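The quoted output power of 24.5 dBm follows from the standard conversion of milliwatts to dBm, 10 log10(P / 1 mW). The following minimal sketch reproduces it for the two 140 mW transmitters; the function name is hypothetical.

```python
import math


def mw_to_dbm(power_mw):
    """Convert a power in milliwatts to dBm (decibels relative to 1 mW)."""
    if power_mw <= 0:
        raise ValueError("power must be positive")
    return 10.0 * math.log10(power_mw)


# Two independent 140 mW transmitters per laser head, as stated in the text.
total_power_mw = 2 * 140.0
total_power_dbm = mw_to_dbm(total_power_mw)
print(f"{total_power_mw:.0f} mW = {total_power_dbm:.1f} dBm")
```

Doubling the power adds about 3 dB, so a single 140 mW transmitter corresponds to roughly 21.5 dBm.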

Fig. 11.2 View of the laser heads at FM (Pole III) and SB (Pole II)

Performance tests of the FSO link were made under favourable weather conditions. During the tests we used a data rate mode for the laser heads which was compatible with Gigabit Ethernet. At OSI level 1 (physical layer), received powers were simultaneously measured for both laser heads. Data were collected from the internal logs of the laser heads, using STC (SONAbeam Terminal Controller) management software [13]. At OSI level 4 (transport layer), measurements were made for TCP connections and UDP communications using Iperf software [15], permitting network performance results to be recorded. Both TCP and UDP are transport protocols. TCP is connection-oriented. UDP is connectionless, as it sends data without ever establishing a connection. For a TCP connection over a link, TCP throughput was obtained. For a UDP communication, we obtained UDP throughput, jitter and percentage loss of datagrams. TCP packets and UDP datagrams of 1470 bytes size were used. A window size of 8 kbytes and a buffer size of the same value were used for TCP and UDP, respectively. A specific field test arrangement was planned and implemented for the measurements (Fig. 11.3). Two PCs having IP addresses 192.168.0.2 and 192.168.0.1 were set up as the Iperf server and client, respectively. The PCs were HP computers, with 3.0 GHz Pentium IV CPUs, running Windows XP. The server had a better RAM configuration than the client. They were both equipped with 1000Base-T network adapters. Each PC was connected via 1000Base-T to a C2 Enterasys switch [16]. Each switch had a 1000Base-LX interface. Each interface was intended to establish an FSO link through two laser heads, as represented in Fig. 11.3. The laser heads were located at Pole II and Pole III, at the SB and FM buildings, respectively. The experimental arrangement could be remotely accessed through the FM LAN. In the UDP tests a bandwidth parameter of 300 Mbps was used in the Iperf client.
Jitter, which represents the smooth mean of differences between consecutive transit times, was continuously computed by the server, as specified by RTP in RFC 1889 [17]. RTP provides end-to-end network transport functions appropriate for applications

Fig. 11.3 Field tests setup scheme for the FSO link

transmitting real-time data, e.g. audio, video, over multicast or unicast network services. At OSI level 7 (application layer) the setup given in Fig. 11.3 was also used for measurements of FTP transfer rates through FTP server and client applications installed in the PCs. Each measurement corresponded to a single FTP transfer, using a 2.71 Gbyte file. Whenever a measurement was made at either OSI level 4 or 7, data were simultaneously collected at OSI level 1. Batch command files were written to enable the TCP, UDP and FTP tests. The results, obtained in batch mode, were recorded as data files in the client PC disk.
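The jitter definition mentioned above (a smoothed mean of differences between consecutive transit times) corresponds to the interarrival jitter estimator of RFC 1889, J = J + (|D| - J)/16. The following is a minimal sketch of that estimator, not the Iperf source code; the function name and the use of plain timestamp lists are assumptions made for illustration.

```python
def interarrival_jitter(send_times, recv_times):
    """Running jitter estimate in the style of RFC 1889.

    send_times and recv_times are per-datagram timestamps in seconds.
    D is the change in transit time between consecutive datagrams and
    the estimate is updated as J <- J + (|D| - J) / 16.
    """
    if len(send_times) != len(recv_times):
        raise ValueError("timestamp lists must have the same length")
    jitter = 0.0
    previous_transit = None
    for sent, received in zip(send_times, recv_times):
        transit = received - sent
        if previous_transit is not None:
            difference = abs(transit - previous_transit)
            jitter += (difference - jitter) / 16.0
        previous_transit = transit
    return jitter


# A constant transit time gives zero jitter.
print(interarrival_jitter([0.0, 1.0, 2.0], [5.0, 6.0, 7.0]))
```

The 1/16 gain makes the estimate a smoothed average, so isolated delay spikes raise the reported jitter only gradually.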

11.3 Results and Discussion

Several sets of data were collected and processed. The TCP, UDP and FTP experiments were not simultaneous. The corresponding results are shown for TCP in Fig. 11.4, for UDP in Fig. 11.6 and for FTP in Fig. 11.8. The average received powers for the SB and FM laser heads were mostly high, in the 25–35 µW interval, which corresponds to link margins of 4.9–6.4 dB (considering Pmin = 8 µW). From Fig. 11.4 it follows that TCP average throughput (314 Mbps) is very steady; some small peaks arise for throughput deviation. Figure 11.5 illustrates details of TCP results over a small interval. Figure 11.6 shows that UDP average throughput (125 Mbps) is fairly steady, having a small steady throughput deviation. The jitter is small, usually less than 1 ms, while percentage datagram loss is practically negligible. Figure 11.7 illustrates details of UDP-jitter results over a small interval. Figure 11.8 shows that average FTP throughput (344 Mbps) is very steady, having low throughput deviation.
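The quoted link margins follow from the received powers if the margin is taken as the ratio of received power to the minimum required power, expressed in decibels. The symbols M and P_rx below are introduced here for illustration; the source only names Pmin.

```latex
M = 10\log_{10}\!\left(\frac{P_{\mathrm{rx}}}{P_{\min}}\right)\ \mathrm{dB},
\qquad
M(25\,\mu\mathrm{W}) = 10\log_{10}\frac{25}{8} \approx 4.9\ \mathrm{dB},
\qquad
M(35\,\mu\mathrm{W}) = 10\log_{10}\frac{35}{8} \approx 6.4\ \mathrm{dB}.
```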

Fig. 11.4 TCP results

Fig. 11.5 Details of TCP results

Figure 11.9 illustrates details of FTP results over a small interval. Transfer rates of the PC disks are always a limitation in this type of FTP experiment. In all cases, high values of average received powers were observed. The quantities under analysis did not show significant variations on average, even when the received powers varied. The results obtained here complement previous work by the authors [9–12]. Generally, for our experimental conditions, the FSO link has exhibited very good performance at OSI levels 4 and 7. Besides the present results, it must be mentioned that we have implemented a VoIP solution based on Cisco Call Manager [18]. VoIP, with G.711 and G.729A coding algorithms, has been working over the laser link without any performance problems. Tools such as Cisco IP Communicator have been used. Video and sound have also been tested through the laser link, by using eyeBeam Softphone CounterPath software [19]. Applications using the link have been well-behaved.

Fig. 11.6 UDP results; 300 Mbps bandwidth parameter

Fig. 11.7 Details of UDP-jitter results; 300 Mbps bandwidth parameter

11.4 Conclusions

An FSO laser link at 1 Gbps has been successfully implemented over 1.14 km across the city, to interconnect Poles of the University and to support demanding applications. A field test arrangement has been planned and implemented, permitting extended performance measurements of the FSO link at OSI levels 1, 4 and 7. At OSI level 1, received powers were simultaneously measured in both laser heads.

Fig. 11.8 FTP results

Fig. 11.9 Details of FTP results

At OSI level 4, TCP throughput and UDP throughput, jitter and percentage datagram loss were measured. At OSI level 7, FTP transfer rate data were acquired. Under the favourable weather conditions in which the measurements were carried out, the link behaved very well, giving very good performance. Applications such as VoIP, video and sound have been well-behaved. Further measurements are planned under several experimental conditions.

Acknowledgments Support from University of Beira Interior and FCT (Fundação para a Ciência e a Tecnologia)/POCI2010 (Programa Operacional Ciência e Inovação) is acknowledged. We acknowledge Hewlett Packard and FSONA for their availability.

References

1. IEEE Std 802.11-2007 (2007) IEEE Standard for Local and metropolitan area networks-Specific Requirements-Part 11: Wireless LAN Medium Access Control (MAC) and Physical Layer (PHY) specifications (October 10, 2007); http://standards.ieee.org/getieee802
2. Rockwell DA, Mecherle GS (2001) Wavelength selection for optical wireless communication systems. Proc SPIE 4530:27–35
3. Amico MD, Leva A, Micheli B (2003) Free space optics communication systems: first results from a pilot field trial in the surrounding area of Milan Italy. IEEE Microwave Wirel Compon Lett 13(8):305–307 August
4. Löschnigg M, Mandl P, Leitgeb E (2009) Long-term performance observation of a free space optics link. In: Proceedings of the 10th International Conference on Telecommunications-Contel, Zagreb, Croatia, June 8–10, pp 305–310
5. Mandl P, Chlestil Ch, Zettl K, Leitgeb E (2007) Hybrid systems using optical wireless, fiber optics and WLAN for temporary multimedia applications. In: Proceedings of the 9th International Conference on Telecommunications-Contel, Zagreb, Croatia, June 13–15, pp 73–76
6. Monteiro E, Boavida F (2002) Engineering of informatics networks, 4th edn. FCA-Editor of Informatics Ld, Lisbon
7. Pacheco de Carvalho JAR, Gomes PAJ, Veiga H, Reis AD (2008) Development of a university networking project. In: Putnik GD, Manuela Cunha M (eds) Encyclopedia of Networked and Virtual Organizations. IGI Global, Hershey, pp 409–422
8. Pacheco de Carvalho JAR, Veiga H, Marques N, Ribeiro Pacheco CF, Reis AD (2010) Laboratory performance of Wi-Fi WEP point-to-point links: a case study. Lecture notes in engineering and computer science: Proceedings of The World Congress on Engineering, WCE 2010, vol I, London, UK, 30 June–2 July, pp 764–767
9. Pacheco de Carvalho JAR, Gomes PAJ, Veiga H, Reis AD (2007) Wi-Fi and very high speed optical links for data and voice communications. In: Proc. 2a Conferência Ibérica de Sistemas e Tecnologias de Informação, Universidade Fernando Pessoa, Porto, Portugal, 21–23 June, pp 441–452
10. Pacheco de Carvalho JAR, Veiga H, Gomes PAJ, Reis AD (2008) Experimental performance evaluation of a very high speed free space optics link at the university of Beira interior campus: a case study. In: Proc. SEONs 2008-VI Symposium on Enabling Optical Network and Sensors, Porto, Portugal, 20–20 June, pp 131–132
11. Pacheco de Carvalho JAR, Veiga H, Gomes PAJ, Ribeiro Pacheco CFFP, Reis AD (2008) Experimental performance study of a very high speed free space optics link at the university of Beira interior campus: a case study. In: Proc. ISSPIT 2008-8th IEEE International Symposium on Signal Processing and Information Technology, Sarajevo, Bosnia and Herzegovina, December 16–19, pp 154–157
12. Pacheco de Carvalho JAR, Marques N, Veiga H, Ribeiro Pacheco CF, Reis AD (2010) Field performance measurements of a Gbps FSO link at Covilha City, Portugal. Lecture notes in engineering and computer science: Proceedings of the world congress on engineering, WCE 2010, vol I, 30 June–2 July, London, UK, pp 814–818
13. Web site http://www.fsona.com; SONAbeam 1250-S technical data; SONAbeam Terminal Controller management software
14. Web site http://www.alvarion.com; Breeze NET B100 data sheet
15. Web site http://dast.nlanr.net; Iperf software
16. Web site http://www.enterasys.com; C2 switch technical manual
17. Network Working Group. RFC 1889-RTP: A Transport Protocol for Real Time Applications, http://www.rfc-archive.org
18. Web site http://www.cisco.com; Cisco Call Manager; Cisco IP Communicator
19. Web site http://www.counterpath.com; eyeBeam Softphone CounterPath software

Chapter 12

A Multi-Classifier Approach for WiFi-Based Positioning System

Jikang Shin, Suk Hoon Jung, Giwan Yoon and Dongsoo Han

Abstract WLAN fingerprint-based positioning systems are a viable solution for estimating the location of mobile stations. Recently, various machine learning techniques have been applied to WLAN fingerprint-based positioning systems to further enhance their accuracy. Due to the noisy characteristics of RF signals, as well as the lack of study of the environmental factors affecting signal propagation, however, the accuracy of the previously suggested systems seems to depend strongly on numerous environmental conditions. In this work, we have developed a multi-classifier for WLAN fingerprint-based positioning systems that employs a combining rule. According to experiments with the multi-classifier performed in various environments, combining multiple classifiers could significantly mitigate the environment-dependent characteristics of the individual classifiers. The performance of the multi-classifier was found to be superior to that of the single classifiers in all test environments; both the average error distances and their standard deviations were improved by the multi-classifier in all test environments.

J. Shin (✉), S. H. Jung
Department of Information and Communications Engineering, Korea Advanced Institute of Science and Technology, 373-1 Kusong-Dong, Yuseong-gu, Daejeon, 305-701, Korea

G. Yoon
Department of Electrical Engineering, Korea Advanced Institute of Science and Technology, 373-1 Kusong-Dong, Yuseong-gu, Daejeon, 305-701, Korea

D. Han
Department of Computer Science, Korea Advanced Institute of Science and Technology, 373-1 Kusong-Dong, Yuseong-gu, Daejeon, 305-701, Korea

S. I. Ao and L. Gelman (eds.), Electrical Engineering and Applied Computing, Lecture Notes in Electrical Engineering, 90, DOI: 10.1007/978-94-007-1192-1_12, © Springer Science+Business Media B.V. 2011

12.1 Introduction

With the explosive proliferation of smart phones, WLAN (Wireless Local Area Network)-based positioning systems have increasingly become mainstream in Location-based Service (LBS) regimes. Compared with other technologies such as GPS [1], RFID [2], GSM [3], ultrasonic [4], and infrared-based systems [5], WLAN-based positioning systems have some advantages in terms of coverage and cost. Most research on WLAN-based positioning systems has used the Received Signal Strength Indication (RSSI) from the wireless network access points, mainly because the RSSI (also called the fingerprint) is relatively easy to obtain in software and is also one of the most relevant factors for positioning. Some studies have considered other factors such as Signal to Noise Ratio (SNR), Angle of Arrival (AOA), and Time of Arrival (TOA). Milos et al. [6] examined the SNR as an additional input factor and reported that considering both SNR and RSSI could increase the performance of a WLAN-based positioning system. Yamasaki et al. [7] reported that the AOA and TOA are also important factors in positioning. However, the AOA, TOA, and SNR cannot always be acquired on every wireless network interface card. Thus, the RSSI appears to have been adopted as the primary factor for WLAN-based positioning systems.

In fact, utilizing the strengths of Radio Frequency (RF) signals for positioning is not a simple task. Due to intrinsic characteristics of RF signals such as multipath fading and interference between signals, the signal strength may change severely depending on the materials used, the positions of doors and windows, the widths of the passages, the number of APs deployed, etc. Even if the fundamental parameters are known in advance, the derivation of the path loss function of a WLAN signal is extremely complex. For this reason, WLAN fingerprint-based positioning systems have mostly taken statistical approaches [6].

The statistical approaches previously suggested have applied various machine learning techniques to derive positions from the measured fingerprints [2, 8–15]. Those techniques usually comprise two phases: an off-line phase and an on-line phase. In the off-line phase, fingerprints are captured at various positions of the target place and stored in a database called a radio map. In the on-line phase, the location of a fingerprint is estimated by comparing it with the stored fingerprints in the database. The main problem of WLAN fingerprint-based positioning systems is that the system performance is too environment-dependent; in other words, no general solution is yet available for WLAN fingerprint-based positioning systems. Each system is designed to tackle different environments, and there is no analysis of the relation between the algorithm used and the test

environments. One method may outperform other methods in one environment, but show inferior results in another. For instance, Youssef et al. [12] suggested a joint-clustering technique and confirmed in their evaluation that their proposed algorithm outperformed RADAR [2]. According to the experiment by Wilson et al. [11], however, RADAR was found to perform better than the joint-clustering technique. This kind of problem was also observed in our experiments.

In this paper, we introduce a multi-classifier for WLAN fingerprint-based positioning systems. We have combined multiple classifiers into an environment-independent classifier that can achieve more stable and higher estimation accuracy in a variety of environments. The motivation for using multiple classifiers lies in the fact that classifier performance is severely environment-dependent; thus, if we can select the most accurate classifier for a given situation, we may be able to achieve better performance in diverse environments. In this work, multiple classifiers were combined using the Bayesian combination rule [16] and majority vote [17]. To demonstrate the effect of combining the classifiers, we have evaluated the proposed system in three different environments. The evaluation results revealed that the multi-classifier could outperform the single classifiers in terms of the average error distances and their standard deviations. This indicates that the proposed combining method is effective in mitigating the environment-sensitive characteristics of WLAN-based positioning systems.

The remainder of this paper is organized as follows. An overview of WLAN fingerprint-based positioning is given in Sect. 12.2. We introduce a multi-classifier for WLAN fingerprint-based positioning systems in Sect. 12.3. Section 12.4 describes the experimental setup and results. Section 12.5 summarizes this work and suggests future work.

12.2 Related Work

Location estimation using WLAN fingerprints is often treated as a machine learning problem because of the high complexity of estimating signal propagation. For this reason, various machine learning techniques have been applied. The RADAR system developed by Bahl et al. [2] is considered one of the most representative WLAN fingerprint-based systems. In this system, the authors used Pentium-based PCs as access points and laptop computers as mobile devices. The system uses nearest neighbor heuristics and triangulation methods to infer a user location. It maintains a radio map which charts the strength of the signals received from the different access points at some selected locations. Each signal-strength measurement is then compared against the radio map, and

then the best matching positions are averaged to give the location estimate. Roos et al. [10] proposed a probability-based system which uses received signal strength samples to create probability distributions of the signal strength for some known locations. Once an input instance is given, it is matched against these probability distributions to find the location of the mobile device with the highest probability. The histogram method suggested by Castro et al. [18] is another example of a probability-based system. Instead of using a Gaussian distribution, it derives the distribution of the signal strength from the learning data. In addition, adaptive neural networks [13], decision trees [14, 15], and support vector machines [19] are popular in WLAN-based positioning systems; Kushki et al. [8] suggested a kernelized distance calculation algorithm for inferring the location of the measured RSSI. Recently, some researchers have focused on compensating for the characteristics of RF signals. Berna et al. [20] suggested a system whose database takes into account unstable factors such as open or closed doors and changes in humidity. They utilized sensors to capture the current status of the environment. Yin [15] introduced a learning approach based on a database that is updated over time in accordance with the current state of the environment. Moraes [21] investigated a dynamic RSS mapping architecture. Wilson Yeung et al. [11] suggested using the RSSI transmitted from the mobile devices as an additional input. Thus, there are two types of databases: the RSSI transmitted by APs and the RSSI transmitted by mobile devices. In the on-line phase, the system infers multiple results from the databases and makes the final decision using a combining method.
Some research efforts [12, 22] have tackled the issue of how to reduce the computational overhead, mainly because the client devices are usually small, self-maintained and stand-alone, with a significant limitation in their power supply. Youssef et al. [12] developed a joint-clustering technique for grouping locations in order to reduce the computational cost of the system. In this method, a cluster is defined as a set of locations sharing the same set of access points. The location determination process is as follows: for a given RSSI data set, the strongest access points are used to determine the one cluster in which to search for the most probable location. Chen et al. [22] suggested a method which selects the most discriminative APs in order to minimize the number of APs used in the positioning system. This approach addresses the computational complexity problem by selecting an appropriate subset of the existing features. Reducing the number of APs amounts to dimension reduction in signal space, which in turn reduces the computational overhead required on the mobile devices. The weak spot of WLAN fingerprint-based positioning systems is that their performance is severely environment-dependent. One system may outperform the other methods in one environment, yet show inferior performance in other environments. To solve this problem, we suggest a multi-classifier approach for WLAN fingerprint-based positioning systems, leading to more accurate results.
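The radio-map matching described earlier in this section (compare a measurement against stored fingerprints, then average the best matching positions) can be sketched as a k-nearest-neighbour search in signal space. This is a minimal illustration and not the RADAR implementation: the type aliases, the function names, and the placeholder value MISSING_RSSI for access points absent from a scan are all assumptions.

```python
import math
from typing import Dict, List, Tuple

Fingerprint = Dict[str, float]   # AP MAC address -> RSSI in dBm
Location = Tuple[float, float]   # (x, y) in metres

# Assumed placeholder RSSI for an AP that was not heard in a scan.
MISSING_RSSI = -100.0


def signal_distance(a: Fingerprint, b: Fingerprint) -> float:
    """Euclidean distance between two fingerprints in signal space."""
    access_points = set(a) | set(b)
    return math.sqrt(sum(
        (a.get(ap, MISSING_RSSI) - b.get(ap, MISSING_RSSI)) ** 2
        for ap in access_points
    ))


def knn_locate(radio_map: List[Tuple[Location, Fingerprint]],
               observed: Fingerprint, k: int = 3) -> Location:
    """Average the positions of the k stored fingerprints closest to `observed`."""
    if not radio_map:
        raise ValueError("radio map is empty")
    nearest = sorted(radio_map,
                     key=lambda entry: signal_distance(entry[1], observed))[:k]
    x = sum(location[0] for location, _ in nearest) / len(nearest)
    y = sum(location[1] for location, _ in nearest) / len(nearest)
    return (x, y)


# Off-line phase: a toy radio map along a corridor (illustrative values only).
radio_map = [
    ((0.0, 0.0), {"ap1": -40.0, "ap2": -70.0}),
    ((1.0, 0.0), {"ap1": -50.0, "ap2": -60.0}),
    ((2.0, 0.0), {"ap1": -60.0, "ap2": -50.0}),
    ((3.0, 0.0), {"ap1": -70.0, "ap2": -40.0}),
]

# On-line phase: locate a newly measured fingerprint.
observed = {"ap1": -42.0, "ap2": -69.0}
print(knn_locate(radio_map, observed, k=2))
```

How missing access points are handled is a design choice that strongly affects the distance; the fixed placeholder used here is only one option.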

12.3 Proposed Method

We utilize multiple classifiers based on different algorithms to build a possibly environment-independent classifier [23]. Combining multiple classifiers to create a strong classifier is a well-established research topic, particularly in the pattern recognition area, where it is known as the Multiple Classifier System (MCS) [24]. Here the term "combining" denotes the process of selecting the most trustworthy of the prediction results obtained from the classifiers. At least two reasons justify the necessity of combining multiple classifiers [25]. First, there are a number of classification algorithms available that were developed from different theories and methodologies for current pattern recognition applications. For a specific application problem, each of these classifiers can usually reach a certain degree of success, but none of them may be perfect, and at least one of them may not perform as well as expected in practical applications. Second, for a specific recognition problem, there are often many types of features which could be used to represent and recognize specific patterns. These features are represented in diverse forms and it is relatively hard to lump them together for one single classifier to make a decision. As a result, multiple classifiers are needed to deal with the different features. This also raises the general problem of how to combine classifiers that use different features so as to yield improved performance. Location estimation using WLAN fingerprints is often treated as a classification problem because of the noisy characteristics of RF signals. Many algorithms have been proposed based on different machine learning techniques, but none of them achieves the best performance across very diverse environments.
At this point, we realized that utilizing multiple classifiers could be a promising general solution for WLAN fingerprint-based positioning systems. In this work, we combined the Bayesian combination rule [16] and majority vote [17] for our multi-classifier. The Bayesian combination rule gives weights to the decisions of each classifier based on information prepared in the learning phase. Usually, this information is given in the form of a matrix called a confusion matrix. The confusion matrix is constructed by cross-validation with the learning data in the off-line phase. The majority vote is a simple algorithm, which chooses the result selected by more than half of the classifiers. Figure 12.1 illustrates the idea of our proposed system. In the off-line phase, the fingerprints are collected over the target environment as learning data. A fingerprint is a collection of pairs, each containing the MAC address of an access point and its signal strength. Usually, one fingerprint contains multiple such pairs, e.g. {<ap_1, rssi_1>, <ap_2, rssi_2>, <ap_3, rssi_3>, ...}. After attaching the location labels to the collected fingerprints, the database stores the labeled fingerprint data. After the learning data have been collected, each classifier C constructs its own confusion matrix M (Fig. 12.2) using cross-validation with the learning data. The

Fig. 12.1 The overview of multi-classifier

Fig. 12.2 An example of confusion-matrix

confusion matrix is used as an indicator of its classifier. If there are $L$ possible locations in the positioning system, $M$ will be an $L \times L$ matrix in which the entry $M_{i,j}$ denotes the number of instances collected in location $i$ that are assigned to location $j$ by the classifier. From the matrix $M$, the total number of data collected in location $i$ can be obtained as a row sum $\sum_{j=1}^{L} M_{i,j}$, and the total number of data assigned to location $j$ can be obtained as a column sum $\sum_{i=1}^{L} M_{i,j}$. When there are $K$ classifiers, there are $K$ confusion matrices $M^{(k)}$, $1 \le k \le K$. In the on-line phase, for the measured fingerprint $x$, the positioning results given by the $K$ classifiers are $C_k(x) = j_k$, $1 \le k \le K$, where $j_k$ can be any of the $L$ possible locations. The probability that the decision made by the classifier $C_k$ is correct can be measured as follows:

$$u(j_k) = P\bigl(x \in j_k \mid C_1(x) = j_1, \ldots, C_K(x) = j_K\bigr) \tag{12.1}$$

Equation 12.1 is called the belief function, and the value of this function is called the belief value. Assuming that all classifiers are independent of each other, and applying Bayes' theorem to Eq. 12.1, the belief function $u(j_k)$ can be reformulated as:

$$u(j_k) = \prod_{i=1}^{K} \frac{P\bigl(x \in j_k \cap C_i(x) = j_i\bigr)}{P\bigl(C_i(x) = j_i\bigr)} \tag{12.2}$$

The denominator and numerator in Eq. 12.2 can be calculated using the confusion matrix $M$. The denominator indicates the probability that the classifier $C_i$ will assign the unknown fingerprint $x$ to $j_i$. This can be presented as follows:

$$P\bigl(C_i(x) = j_i\bigr) = \frac{\sum_{j=1}^{L} M_{i,j}}{\sum_{i,j=1}^{L} M_{i,j}} \tag{12.3}$$

The numerator in Eq. 12.2 is the probability that the classifier $C_i$ will assign the unknown fingerprint $x$ collected in $j_k$ to $j_i$. This term is simply described as below:

$$P\bigl(x \in j_k \cap C_i(x) = j_i\bigr) = \frac{M_{j_k,j_i}}{\sum_{i,j=1}^{L} M_{i,j}} \tag{12.4}$$

After applying Eqs. 12.3 and 12.4 to Eq. 12.2, Eq. 12.2 can be reformulated as:

$$u(j_k) = \prod_{i=1}^{K} \frac{M_{j_k,j_i}}{\sum_{j=1}^{L} M_{i,j}}$$

If more than half of the classifiers point to a specific location, that location is selected as the final result. Otherwise, the belief value of each prediction is calculated, and the location with the highest belief value is the final result. In case there are several locations with the same highest belief value, the

multi-classifier system determines the middle point of those locations as the final result. For example, assume that there are three classifiers, a, b, and c, and there are three possible locations, location 1, location 2 and location 3. After the off-line phase, the confusion matrices will be as follows:

$$M^{(a)} = \begin{pmatrix} 18 & 4 & 7 \\ 2 & 12 & 3 \\ 0 & 4 & 10 \end{pmatrix}, \qquad
M^{(b)} = \begin{pmatrix} 12 & 6 & 6 \\ 3 & 9 & 3 \\ 2 & 5 & 11 \end{pmatrix}, \qquad
M^{(c)} = \begin{pmatrix} 14 & 2 & 2 \\ 4 & 11 & 5 \\ 2 & 7 & 13 \end{pmatrix}$$

If the classifiers a, b, and c assigned the unknown instance $x$ to location 1, location 2, and location 3, respectively, the belief values of the predictions can be calculated as follows:

$$u(j_a) = \frac{18}{29} \cdot \frac{3}{15} \cdot \frac{2}{22} = \frac{108}{9570}$$

$$u(j_b) = \frac{4}{29} \cdot \frac{9}{15} \cdot \frac{7}{22} = \frac{252}{9570}$$

$$u(j_c) = \frac{7}{29} \cdot \frac{3}{15} \cdot \frac{13}{22} = \frac{273}{9570}$$

The multi-classifier assigns location 3 to the unknown instance $x$, because $j_c$, the prediction of the classifier c, has the highest belief value.
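The decision procedure and the worked example above can be reproduced with a short sketch. The indexing is an assumption read off the example's arithmetic: each factor is the entry in the row of the classifier's own prediction and the column of the candidate location, divided by that row's sum. The columns of the example matrices sum to about 20, which suggests they index the location where the data were collected. Locations are 0-indexed in the code, the function names are illustrative, and ties are returned as a list, leaving the midpoint computation to the caller.

```python
from collections import Counter
from fractions import Fraction
from typing import List, Sequence

Matrix = Sequence[Sequence[int]]


def belief(candidate: int, predictions: Sequence[int],
           matrices: Sequence[Matrix]) -> Fraction:
    """Belief value of `candidate` given each classifier's prediction.

    Assumes matrix[row][col] counts samples collected at location `col`
    that the classifier assigned to location `row`, with non-empty rows.
    """
    value = Fraction(1)
    for predicted, matrix in zip(predictions, matrices):
        row = matrix[predicted]
        value *= Fraction(row[candidate], sum(row))
    return value


def combine(predictions: Sequence[int], matrices: Sequence[Matrix]) -> List[int]:
    """Majority vote first; otherwise the location(s) with the highest belief."""
    location, count = Counter(predictions).most_common(1)[0]
    if count > len(predictions) / 2:
        return [location]
    beliefs = {loc: belief(loc, predictions, matrices) for loc in set(predictions)}
    best = max(beliefs.values())
    return sorted(loc for loc, value in beliefs.items() if value == best)


# Confusion matrices of classifiers a, b and c from the worked example.
matrices = [
    [[18, 4, 7], [2, 12, 3], [0, 4, 10]],
    [[12, 6, 6], [3, 9, 3], [2, 5, 11]],
    [[14, 2, 2], [4, 11, 5], [2, 7, 13]],
]
predictions = [0, 1, 2]  # a -> location 1, b -> location 2, c -> location 3

for loc in predictions:
    print(f"location {loc + 1}: belief = {belief(loc, predictions, matrices)}")
print("selected:", [loc + 1 for loc in combine(predictions, matrices)])
```

Exact fractions are used so that the values can be compared directly with the printed 108/9570, 252/9570 and 273/9570; Python displays them in reduced form.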

12.4 Evaluation

12.4.1 Experimental Setup

The performance of WLAN-based positioning systems depends on the environment where the evaluation is performed. For this reason, we evaluated the proposed multi-classifier in three different environments; Table 12.1 summarizes the test environments. Testbed 1 is an office environment; the dimension of the corridor in the office is 3 × 60 m. The office is on the third floor of the faculty building at the KAIST-ICC in Daejeon, South Korea. In the corridor, we collected 100 fingerprint samples at each of 60 different locations. Each location is 1 m away from the next. The testbed 2 indicates another office

Table 12.1 Summary of testbeds

                                         Testbed 1   Testbed 2   Testbed 3
Type                                     Corridor    Corridor    Hall
Dimension (m)                            3 × 60      4 × 45      15 × 15
Number of RPs                            60          45          25
Distance between RPs (m)                 1           1           3
Number of APs deployed                   48          69          36
Avg. number of APs in one sample         16.6        16.8        13.9
Std. dev. of number of APs in a sample   1.89        4.24        3.48

environment where the dimension of the corridor is 4 × 45 m. The office is located on the second floor of the Truth building at the KAIST-ICC. We collected 100 fingerprint samples at each of 45 different locations. Each location is 1 m away from the next. Testbed 3 is a large, empty space inside the building, located on the first floor of the Lecture building at the KAIST-ICC. The dimension of the space is 15 × 15 m. In testbed 3, we collected 100 fingerprint samples at each of 25 different locations. Each location is 3 m away from the next. In contrast to testbeds 1 and 2, testbed 3 contains no attenuation factors that may disturb signal propagation. To collect the data, we used the HTC-G1 mobile phone with the Android 1.6 platform, and used the API provided by the platform. We used half (50%) of the collected data as the learning data and the rest as the test data. To evaluate the multi-classifier, we built it from three classifiers, k-NN (with k = 3) [2], Bayesian [9], and Histogram classifiers [10]; the performance of the multi-classifier was compared with these three classifiers, as shown in Table 12.2.

12.4.2 Results

We can observe from the results that no single classifier outperformed the others in all three test environments. These results indicate that the performance of WLAN fingerprint-based positioning systems is sensitive to the environment, and that the multi-classifier turned out to be effective in mitigating this sensitivity. Figure 12.3 reports the average error distance with respect to the number of APs used. As Fig. 12.3a and b show, the performance of the classifiers differs considerably between the test environments. Although testbed 1 and testbed 2 are similar indoor environments, the performance in testbed 1 is better than that in testbed 2. In particular, the average error distance of the k-NN classifier in testbed 1 was 1.21 m when 15 APs were used for positioning,

Table 12.2 Summary of testbeds (meter)

Testbed     Classifier   Avg   Std.Dev   Max    Min   90th Percentile
Testbed 1   k-NN         4.0   5.3       43     0     12.0
            Histogram    2.6   3.8       29     0     7.0
            Bayesian     2.8   3.9       25     0     7.0
            Multi        2.4   3.6       25     0     7.0
Testbed 2   k-NN         1.3   3.0       44     0     3.0
            Histogram    2.0   2.5       26     0     5.0
            Bayesian     1.3   1.8       17     0     3.0
            Multi        1.1   1.6       13     0     3.0
Testbed 3   k-NN         4.8   4.5       22.5   0     18.03
            Histogram    5.8   4.6       22.5   0     20.62
            Bayesian     5.6   5.1       22.5   0     21.21
            Multi        4.5   4.5       22.5   0     18.03

Fig. 12.3 Average error distance versus number of AP used for positioning in a Testbed 1, b Testbed 2, and c Testbed 3 respectively

while it was 4.6 m in testbed 2. For the histogram classifier, the average error distances with 15 APs were 1.9 and 2.7 m in testbed 1 and testbed 2, respectively. Under the same condition, the Naïve Bayesian classifier's average error distances in testbeds 1 and 2 were 1.25 and 2.47 m, respectively. Compared with the other classifiers, the multi-classifier showed improved results. In testbeds 1 and 2, the average error distances of the multi-classifier


A Multi-Classifier Approach for WiFi-Based Positioning System


with 15 APs were 1.1 and 2.3 m, respectively. In testbed 3, the accuracy of all classifiers is much poorer than in the other testbeds. Based on these findings, we believe that WLAN fingerprint-based positioning systems perform better in office environments than in hall environments that involve few attenuation factors. As shown in Fig. 12.3, the multi-classifier mitigates the environment-dependent characteristics of the single classifiers, and we conclude that it is effective for reducing the error distance in localization. Table 12.2 summarizes the performance of the classifiers. The standard deviation of the errors of the multi-classifier in testbed 1 was 3.6 m, while k-NN, Histogram, and Bayesian showed standard deviations of 5.8, 3.8, and 3.9 m, respectively. In testbed 2, the standard deviations of the errors of all classifiers were lower than in testbed 1: 3.0, 2.5, and 1.8 m for k-NN, Histogram, and Bayesian, respectively. In testbed 3, the standard deviations for the k-NN, Histogram, and Bayesian classifiers were 4.5, 4.6, and 5.1 m, respectively. These results confirm that the standard deviation of the errors of WLAN fingerprint-based positioning systems also depends on the environment. In terms of the standard deviation of the error, the proposed multi-classifier matched or outperformed the others in all testbeds: in testbeds 1, 2, and 3, its standard deviations were 3.6, 1.6, and 4.5 m, respectively, which are lower than or equal to those of the other classifiers. From these results, we confirmed that the multi-classifier could mitigate the environment-dependent characteristics of the single classifiers, and that its performance was better than that of the others in all environments. Even though the improvement in performance was not remarkable, the results indicate that combining a number of classifiers is a promising approach to constructing reliable and accurate WLAN fingerprint-based positioning systems.
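The summary statistics discussed above (average, standard deviation, maximum, minimum and 90th percentile of the error distance) can be computed along the following lines. This is only a sketch: the coordinates are made up, and the use of the population standard deviation and of a nearest-rank percentile are assumptions, since the text does not state which definitions were used.

```python
import math
import statistics

def error_summary(true_xy, est_xy):
    """Summarise positioning errors (same unit as the coordinates)."""
    # Error distance = Euclidean distance between true and estimated positions
    errors = sorted(math.dist(t, e) for t, e in zip(true_xy, est_xy))
    # Nearest-rank 90th percentile (assumed definition)
    rank = max(1, math.ceil(0.9 * len(errors)))
    return {
        "avg": statistics.fmean(errors),
        "std": statistics.pstdev(errors),  # population std. dev. (assumed)
        "max": errors[-1],
        "min": errors[0],
        "p90": errors[rank - 1],
    }

# Hypothetical true and estimated positions in meters
true_xy = [(0, 0), (3, 0), (6, 0), (9, 0)]
est_xy = [(0, 0), (3, 4), (6, 0), (12, 4)]

summary = error_summary(true_xy, est_xy)
print(summary)
```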

12.5 Summary and Future Work

In this paper, we have presented an environment-independent multi-classifier for WLAN fingerprint-based positioning systems in an effort to mitigate undesirable environmental effects and factors. We have developed a method of combining multiple classifiers for the purpose of error correction. For example, if a single classifier makes a wrong prediction, the other classifiers correct it. In other words, the classifiers in the multi-classifier can complement each other. We have evaluated the multi-classifier in three different environments with various environmental factors: the number of APs, the width of the corridor, the materials used, etc. The multi-classifier was constructed from three different classifiers: k-NN (with k = 3), Bayesian, and Histogram. As a result, the multi-classifier showed consistent performance in the diverse test environments, while the other classifiers showed inconsistent performance. The performance of


the multi-classifier tends to follow that of the single classifier showing the best performance. This means that the classifiers in the multi-classifier complement each other, and thus the errors are corrected more effectively. As the next step, we are going to investigate a more efficient combining rule. In this work, we have mixed the Bayesian combining rule and the majority vote; however, the performance enhancement was marginal. Considering the complexity overhead of using multiple classifiers, the multi-classifier may not be a cost-effective approach. Finding the best combination of classifiers will be another direction of our future work. We have tested only three classifiers, and two of them take similar approaches; the fingerprint is the only feature used for positioning. There are a number of systems that consider various aspects of WLAN signals and use additional features. In the near future, we are going to implement and evaluate the multi-classifier with various types of classifiers.

Acknowledgments This research was supported by the MKE (The Ministry of Knowledge Economy), Korea, under the ITRC (Information Technology Research Center) support program supervised by the NIPA (National IT Industry Promotion Agency) (NIPA-2010-(C1090-10110013)), and by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MEST) (No. 2008-0061123).

References

1. Enge P, Misra P (1999) Special issue on global positioning system. Proc IEEE 87(1):3–15
2. Bahl P, Padmanabhan V (2000) RADAR: an in-building RF-based user location and tracking system. Proc IEEE Infocom 2:775–784
3. Drane C, Macnaughtan M, Scott C (1998) Positioning GSM telephones. IEEE Commun Mag 36(4):46–54
4. Priyantha N, Chakraborty A, Balakrishnan H (2000) The cricket location-support system. In: Proceedings of the 6th Annual International Conference on Mobile Computing and Networking, pp 32–43
5. Want R, Hopper A, Falcão V, Gibbons J (1992) The active badge location system. ACM Trans Inf Syst (TOIS) 10(1):102
6. Borenovic M, Neskovic A (2009) Comparative analysis of RSSI, SNR and noise level parameters applicability for WLAN positioning purposes. In: Proceedings of the IEEE EUROCON, pp 1895–1900
7. Yamasaki R, Ogino A, Tamaki T, Uta T, Matsuzawa N, Kato T (2005) TDOA location system for IEEE 802.11b WLAN. In: Proceedings of IEEE WCNC'05, pp 2338–2343
8. Kushki A, Plataniotis K, Venetsanopoulos A (2007) Kernel-based positioning in wireless local area networks. IEEE Trans Mobile Comput 6(6):689–705
9. Madigan D, Elnahrawy E, Martin R (2005) Bayesian indoor positioning systems. In: Proceedings of INFOCOM, pp 1217–1227
10. Roos T, Myllymaki P, Tirri H, Misikangas P, Sievanen J (2002) A probabilistic approach to WLAN user location estimation. Int J Wirel Inf Netw 9(3):155–164
11. Yeung W, Ng J (2007) Wireless LAN positioning based on received signal strength from mobile device and access points. In: IEEE International Conference on Embedded and Real-Time Computing Systems and Applications, pp 131–137


12. Youssef M, Agrawala A, Shankar A (2003) WLAN location determination via clustering and probability distributions. In: Proceedings of the First IEEE International Conference on Pervasive Computing and Communications, p 143
13. Borenovic M, Neskovic A, Budimir D (2009) Cascade-connected ANN structures for indoor WLAN positioning. Intell Data Eng Autom Learning-IDEAL 392–399
14. Chen Y, Yang Q, Yin J, Chai X (2006) Power-efficient access-point selection for indoor location estimation. IEEE Trans Knowl Data Eng 18(7):877–888
15. Yin J, Yang Q, Ni L (2008) Learning adaptive temporal radio maps for signal-strength-based location estimation. IEEE Trans Mobile Comput 7(7):869–883
16. Xu L, Krzyzak A, Suen C (1992) Methods of combining multiple classifiers and their application to hand writing recognition. IEEE Trans Syst Man Cybern 22:418–435
17. Kuncheva L (2001) Decision templates for multiple classifier fusion: an experimental comparison. Pattern Recogn 34(2):299–314
18. Castro P, Chiu P, Kremenek T, Muntz R (2001) A probabilistic room location service for wireless networked environments. In: Proceedings of the 3rd International Conference on Ubiquitous Computing, pp 18–34
19. Brunato M, Battiti R (2005) Statistical learning theory for location fingerprinting in wireless LANs. Comput Netw 47(6):825–845
20. Berna M, Lisien B, Sellner B, Gordon G, Pfenning F, Thrun S (2003) A learning algorithm for localizing people based on wireless signal strength that uses labeled and unlabeled data. In: Proceedings of IJCAI, pp 1427–1428
21. Moraes L, Nunes B (2006) Calibration-free WLAN location system based on dynamic mapping of signal strength. In: Proceedings of the 4th ACM International Workshop on Mobility Management and Wireless Access, pp 92–99
22. Chen Y, Yin J, Chai X, Yang Q (2006) Power efficient access-point selection for indoor location estimation. IEEE Trans Knowl Data Eng 1(18):878–888
23. Shin J, Han D (2010) Multi-classifier for WLAN fingerprint-based positioning system. Lecture notes in engineering and computer science: Proceedings of the World Congress on Engineering, WCE 2010, 30 June–2 July, London, UK, pp 768–773
24. Kittler J (1998) Combining classifiers: a theoretical framework. Pattern Anal Appl 1(1):18–27
25. Chen K, Wang L, Chi H (1997) Method of combining multiple classifiers with different features and their applications to text-independent speaker identification. Int J Pattern Recognit Artif Intell 11(3):417–445

Chapter 13

Intensity Constrained Flat Kernel Image Filtering, a Scheme for Dual Domain Local Processing

Alexander A. Gutenev

Abstract A non-linear image filtering scheme is described. The scheme is inspired by the dual domain bilateral filter, but owing to a much simpler pixel weighting arrangement the computation of the result is much faster. The scheme relies on two principal assumptions: equal weight of all pixels within an isotropic kernel and a constraint imposed on the intensity of pixels within the kernel. The constraint is defined by the intensity of the central pixel under the kernel. Hence the name of the scheme: Intensity Constrained Flat Kernel (ICFK). Unlike the bilateral filter, designed solely for the purpose of edge preserving smoothing, the ICFK scheme produces a variety of filters depending on the underlying processing function. This flexibility is demonstrated by examples of an edge preserving noise suppression filter, a contrast enhancement filter and an adaptive image threshold operator. The latter classifies pixels depending on the local average. The versatility of the operators already discovered suggests further potential of the scheme.

13.1 Introduction

The initial stimulus for the development of the proposed scheme arose from the need for a noise suppressing, edge preserving smoothing filter with quasi real-time performance. The literature on edge preserving smoothing is plentiful. The most successful methods employ a dual domain approach: they define the operation result as a function of "distances" in two domains, spatial and intensity. The "distances" are measured from a reference pixel of the input image. Well known

A. A. Gutenev (&) Retiarius Pty Ltd., P.O. Box 1606, Warriewood, NSW 2102, Australia e-mail: [email protected]

S. I. Ao and L. Gelman (eds.), Electrical Engineering and Applied Computing, Lecture Notes in Electrical Engineering, 90, DOI: 10.1007/978-94-007-1192-1_13, © Springer Science+Business Media B.V. 2011


examples are SUSAN [1] or, in a more general form, the bilateral filter [2]. The main design purpose of these filtering schemes was the adaptation of the level of smoothing to the amount of detail available within the neighborhood of the reference pixel. The application of such schemes ranges from adaptive noise suppression to the creation of cartoon-like scenes from real world photographs [3]. The main weakness of the bilateral filter is its slow execution speed due to the exponential weighting functions applied to the image pixels in both spatial and intensity domains. There is a range of publications describing ways of improving the calculation speed of the bilateral filter [4–6]. In this paper we shall see that the simplification of the weighting functions in both spatial and intensity domains not only increases the speed of computation without losing the essence of edge preserving smoothing, but also suggests a filter generation scheme versatile enough to produce operators beyond the original task of adaptive smoothing.
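To make the cost of the exponential weighting concrete, the bilateral filter can be sketched for a one-dimensional signal as follows. This is a brute-force illustration only: the function names, the test signal and the width parameters are assumptions, and no truncation of the neighborhood is applied, so two Gaussian evaluations are made for every pair of samples.

```python
import math

def gauss(x, sigma):
    """Gaussian weighting function with width parameter sigma."""
    return math.exp(-x * x / (2 * sigma * sigma)) / (sigma * math.sqrt(2 * math.pi))

def bilateral_1d(signal, sigma_s, sigma_r):
    """Brute-force bilateral filter of a 1-D signal (no neighborhood truncation)."""
    out = []
    for p, i_p in enumerate(signal):
        num = den = 0.0
        for q, i_q in enumerate(signal):
            # Spatial weight times intensity (range) weight
            w = gauss(abs(p - q), sigma_s) * gauss(abs(i_p - i_q), sigma_r)
            num += w * i_q
            den += w  # normalization coefficient
        out.append(num / den)
    return out

# Noisy step edge: the two plateaus are smoothed, the edge is preserved
step = [10, 11, 9, 10, 100, 101, 99, 100]
smoothed = bilateral_1d(step, sigma_s=2.0, sigma_r=5.0)
print([round(v, 2) for v in smoothed])
```

Samples on the far side of the step receive a negligible intensity weight, which is why the edge survives the averaging.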

13.2 Intensity Constrained Flat Kernel Filtering Scheme

13.2.1 Intensity Constrained Flat Kernel Filter as a Simplification of the Bilateral Filter

The bilateral filter is considered here in the light of its original purpose: single pass application. The output of the bilateral filter [2] is given by the formula [5]

$$I_p^b = \frac{1}{W_p^b} \sum_{q \in S} G_{\sigma_S}(|p - q|)\, G_{\sigma_R}(|I_p - I_q|)\, I_q, \tag{13.1}$$

where $p$ and $q$ are vectors describing the spatial positions of the pixels, $p, q \in S$, where $S$ is the spatial domain, the set of all possible pixel positions within the image; $I_p$ and $I_q$ are the intensities of the pixels at positions $p$ and $q$, $I_p, I_q \in R$, where $R$ is the range or intensity domain, the set of all possible intensities of the image;

$$G_\sigma(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{x^2}{2\sigma^2}\right)$$

is the Gaussian weighting function, with separate weight parameters $\sigma_S$ and $\sigma_R$ for the spatial and intensity components; and

$$W_p^b = \sum_{q \in S} G_{\sigma_S}(|p - q|)\, G_{\sigma_R}(|I_p - I_q|)$$

is the normalization coefficient. Formula (13.1) states that the resulting intensity $I_p^b$ of the pixel at position $p$ is calculated as a weighted sum of the intensities of all other pixels in the image, with the weights decreasing exponentially as the distance between the pixel at variable position $q$ and the reference pixel at position $p$ increases. The contributing distances are measured in both spatial and range domains. Owing to the digital nature of the signal, function (13.1) has a finite support and its calculation is truncated to the neighborhoods of the pixel at position $p$ and intensity $I_p$. The size of the neighborhood is defined by the parameters $\sigma_S$ and $\sigma_R$ and the sampling rates in both spatial


Fig. 13.1 Components making the bilateral filter

and intensity domains. The computation scheme proposed below truncates (13.1) further by giving all pixels in the selected neighborhood the same spatial weight. Furthermore, the intensity weighting part of (13.1), applied to the histogram of the neighborhood, is reduced to a range constraint around the intensity $I_p$ of the reference pixel. The idea is illustrated by Figs. 13.1 and 13.2. For simplicity, a single-dimension signal is presented on the graphs. The components that make up the output of the bilateral filter (Fig. 13.1) at a particular spatial position $p$ are: (i) the part of the signal under the kernel centered at the pixel at $p$; (ii) the Gaussian spatial weighting function with its maximum at the pixel at $p$ and "width" parameter $\sigma_S$; (iii) the histogram of the pixels under the kernel centered at the pixel at $p$; (iv) the Gaussian intensity weighting function with its maximum at $I_p$ and "width" parameter $\sigma_R$. The proposed filtering scheme (Fig. 13.2) replaces components (ii) and (iv), the Gaussians, with simple windowing functions. The flat


Fig. 13.2 Components making the Intensity constrained flat kernel filtering scheme

kernel works as a spatial filter selecting spatial information in the neighborhood of the reference pixel at $p$. This information, in the form of a histogram, is passed to the intensity filter, which limits the processed information to that in the intensity neighborhood of the reference pixel intensity $I_p$. This is where the commonality between the bilateral and ICFK filtering schemes ends. For the ICFK scheme the result of the operation depends on the processing function applied to the spatially pre-selected data:

$$I_p^{ICFK} = \begin{cases} F\left(H_{v_p}|_{K(p)}\right), & \text{if } H_{v_p}(I_p) \neq 1, \\ G\left(H_{v_p}\right), & \text{if } H_{v_p}(I_p) = 1, \end{cases} \tag{13.2}$$


where $H_{v_p}$ is a histogram of the part of the image that is masked by the kernel $v$ with its centre at $p$, $H_{v_p}(I_p)$ is the pixel count of the histogram at the level $I_p$, and $H_{v_p}|_{K(p)}$ is the part of the histogram $H_{v_p}$ subject to the constraint $K(p)$. The introduction of the second function $G$, applied only when the intensity level $I_p$ is unique within the region masked by the kernel, is a way of emphasizing the need for special treatment of potential outliers. Indeed, if the intensity level of a pixel is unique within a sizeable neighborhood, the pixel most likely belongs to noise and should be treated as such. As will be shown below, the selection of the functions $F$ and $G$, as well as the constraint $K$, defines the nature of the resulting filter, which includes but is not limited to adaptive smoothing. The output of the filter (13.2) also depends on the shape of the kernel $v$. Often in digital image processing, the selection of a kernel shape is based on the speed of calculation of the filter results as the kernel scans across the image. In the case of ICFK filters, this translates into the speed of histogram updates during the scan. There is a significant number of publications [7–9] on methods of speeding up histogram updates as a square kernel scans the image. In order to avoid shape distortion of the filter output, it is more appropriate to use an isotropic kernel, a digital approximation of a circle. A method to speed up the histogram updates while scanning with an isotropic kernel is described in [10]. It is based on the idea proposed in [11]. In the analysis and examples below an isotropic kernel is used. Such a kernel is fully defined by its radius $r$. A few words have to be said about the choice of the constraint $K(p)$. In the bilateral filter this role is played by the exponent. By separating the constraint function from the processing functions $F$ and $G$, an extra degree of freedom is added to the filtering scheme. One possible definition of $K(p)$ is offered in Fig. 13.2, where the exponent is replaced by a window function with a fixed window size:

$$K(p) = I_p \pm d,$$

where $d$ is a fixed number that depends on the dynamic range of the source image. For example, for integral image types it is an integer. In some cases, when looking for dark features on a bright background, one may want to apply stronger smoothing to the brighter part of the image and reduce smoothing as the intensity decreases. Then the constraint can take the form

$$K(p) = I_p \pm I_p c, \tag{13.3}$$

where $c$ is a fixed ratio. Furthermore, one can make the constraint adaptive and, for example, shrink the domain of the function $F$ as the variance within the area masked by the kernel increases:

$$K(p) = I_p \pm \left[d_{max} - a\,(d_{max} - d_{min})\right],$$

where $d_{max}$ and $d_{min}$ are fixed maximum and minimum values for the intensity range,

$$a = \frac{\operatorname{var}(H_{v_p}) - \min_{q \in S}\operatorname{var}(H_{v_q})}{\max_{q \in S}\operatorname{var}(H_{v_q}) - \min_{q \in S}\operatorname{var}(H_{v_q})}, \qquad \max_{q \in S}\operatorname{var}(H_{v_q}) - \min_{q \in S}\operatorname{var}(H_{v_q}) \neq 0,$$

and $\operatorname{var}(H_{v_p})$ is the variance of the area under the kernel centered at $p$.
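The three forms of the constraint K(p) described in this section can be written as small helper functions that return the lower and upper intensity bounds. The function and parameter names below are illustrative assumptions; in particular, the per-image minimum and maximum of the local variance are taken as precomputed inputs.

```python
def fixed_window(i_p, d):
    """Fixed window: K(p) = I_p +/- d."""
    return i_p - d, i_p + d

def proportional_window(i_p, c):
    """Proportional window of (13.3): K(p) = I_p +/- I_p * c."""
    return i_p - i_p * c, i_p + i_p * c

def adaptive_window(i_p, var_p, var_min, var_max, d_min, d_max):
    """Adaptive window: the half-width shrinks from d_max to d_min
    as the local variance var_p grows from var_min to var_max."""
    if var_max == var_min:
        raise ValueError("a is undefined when the variance range is zero")
    a = (var_p - var_min) / (var_max - var_min)
    half_width = d_max - a * (d_max - d_min)
    return i_p - half_width, i_p + half_width

print(fixed_window(100, 10))
print(proportional_window(200, 0.25))
print(adaptive_window(100, var_p=2.0, var_min=0.0, var_max=4.0, d_min=5, d_max=20))
```

Keeping the constraint separate from the processing functions F and G mirrors the extra degree of freedom mentioned above: any of these windows can be combined with any choice of F and G.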

13.3 Operators Derived from Intensity Constrained Flat Kernel Filtering Scheme

13.3.1 Edge Preserving Smoothing Filter

This filter can be considered a mapping of the bilateral filter into the ICFK filtering scheme. The functions $F$ and $G$ are given by the following formulae:

$$F = \overline{H_{v_p}|_{K(p)}}$$

is the average intensity within that part of the histogram under the kernel mask which satisfies the constraint $K(p)$, and

$$G = \operatorname{median}\left(H_{v_p}\right) \tag{13.4}$$

is the median of the area under the kernel mask. The median acts as a spurious noise suppression filter. From a computational point of view, the update of the histogram as the kernel slides across the image is the slowest operation. It was shown in [10] that the updates of the histogram and the value of the median for an isotropic kernel can be performed efficiently and require O(r) operations, where $r$ is the radius of the kernel. The edge preserving properties of the filter emanate from the adaptive nature of the function $F$. The histogram $H_{v_p}$ is a statistic calculated within the mask of the neighborhood $v$ of the pixel at $p$ and comprises the intensities of all pixels within that neighborhood. However, the averaging is applied only to the intensities that are in a smaller intensity neighborhood of $I_p$ constrained by $K(p)$. Thus the output value $I_p^{ICFK}$ is similar in intensity to $I_p$, and intensity-similar features from the spatial neighborhood are preserved in the filter output. If the level $I_p$ is unique in the neighborhood, it is considered to be noise and is replaced by the neighborhood median. An example of the application of the filter is given in Fig. 13.3. The condition (13.3) was used as the constraint. The filter is effective against small particle noise, such as noise produced by camera gain, where linear or median filters would not only blur the edges but would also create perceptually unacceptable noise lumps.
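A brute-force sketch of the edge preserving smoothing filter may help to fix the idea. It assumes the proportional constraint (13.3), recomputes the neighborhood at every pixel instead of using the fast histogram updates of [10], and clips the kernel at the image border; the function names and the test image are illustrative assumptions.

```python
from statistics import median

def disk_offsets(r):
    """Offsets of an isotropic kernel: a digital approximation of a circle."""
    return [(dy, dx) for dy in range(-r, r + 1) for dx in range(-r, r + 1)
            if dy * dy + dx * dx <= r * r]

def icfk_smooth(img, r, c):
    """ICFK smoothing with F = constrained average and G = median."""
    h, w = len(img), len(img[0])
    offsets = disk_offsets(r)
    out = [row[:] for row in img]
    for y in range(h):
        for x in range(w):
            # Pixels masked by the flat kernel (clipped at the image border)
            nbrs = [img[y + dy][x + dx] for dy, dx in offsets
                    if 0 <= y + dy < h and 0 <= x + dx < w]
            i_p = img[y][x]
            if nbrs.count(i_p) == 1:
                # I_p is unique under the kernel: treat it as noise
                out[y][x] = median(nbrs)
            else:
                lo, hi = i_p - i_p * c, i_p + i_p * c
                sel = [v for v in nbrs if lo <= v <= hi]
                out[y][x] = sum(sel) / len(sel)
    return out

# Step edge (10 | 100) with a single noisy pixel of value 255
img = [[10, 10, 10, 100, 100, 100],
       [10, 10, 10, 100, 100, 100],
       [10, 255, 10, 100, 100, 100],
       [10, 10, 10, 100, 100, 100],
       [10, 10, 10, 100, 100, 100]]
out = icfk_smooth(img, r=1, c=0.1)
print(out[2])
```

In this toy image the isolated pixel is replaced by the neighborhood median, while pixels next to the step are averaged only with neighbors of similar intensity, so the edge stays sharp.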


Fig. 13.3 Fragment of an underwater image, 733 × 740 pixels, with a large number of suspended particles, and the result of applying the edge preserving smoothing filter with radius r = 12, subject to the intensity constraint $K(p) = I_p \pm 0.09\, I_p$

Similarly to the bilateral filter, application of the proposed filter gives the areas with small contrast variation a cartoon-like appearance. Flat kernels have been used for image smoothing since the inception of image processing. Other filtering schemes also place some constraints on the pixels within the kernel mask. A good example is the sigma filter [12] and its derivatives [13]. The fundamental difference between the sigma filter and the proposed filter is in the treatment of the pixels within the mask. The sigma filter applies the filtering action, the mean operator, to all the pixels within the mask if the central pixel is within a certain tolerance, the $\sigma$ range, of the mean of the area under the mask; otherwise the filtering action is not applied and the pixel's input value is passed directly to the output. In the proposed filter Hamlet's question, "to filter or not to filter", is never posed. The filtering action is always applied, but only to the pixel subset that falls within a certain intensity range of the central pixel. Moreover, the filter output depends only on that reduced range, not on the whole region under the kernel mask as in the sigma filter.

13.3.2 Contrast Enhancement Filter for Low Noise Images

The expression (13.2) is general enough to describe not only "smoothing" filters, but "sharpening" ones as well. Consider the following expression for the operator function $F$:

$$F = \begin{cases} \min\left(H_{v_p}|_{K(p)}\right), & \text{if } I_p \le \overline{H_{v_p}}, \\ \max\left(H_{v_p}|_{K(p)}\right), & \text{if } I_p > \overline{H_{v_p}}, \end{cases} \tag{13.5}$$

where $\overline{H_{v_p}}$ is the average intensity of the area under the kernel $v$ at $p$.


Fig. 13.4 An example of a dermatoscopic image, 577 × 434 pixels, of a skin lesion

For the purpose of noise suppression, the function (13.4) is the recommended choice for $G$ in (13.2). The function $F$ pushes the intensity of the output to one of the boundaries defined by the constraint, depending on the relative position of the reference intensity $I_p$ and the average intensity under the kernel. Like any other sharpening operator, the operator (13.5) amplifies the noise in the image. Hence it is most effective on low noise images. Dermatoscopic images of skin lesions are a good example of this class of images. Dermatoscopy, or epiluminescence microscopy, is a technique for imaging skin lesions using oil immersion. The latter is employed in order to remove specular light reflection from the skin surface. This technique has a proven diagnostic advantage over clinical photography, where images are taken without reflection-suppressing oil immersion [14, 15]. Normally the technique uses controlled lighting conditions. With a proper balance of light intensity and camera gain, images taken with digital cameras have a very low level of electronic noise, while the specular reflection noise is removed by the immersion. An example of such an image is given in Fig. 13.4. Some lesions can have a very low inter-feature contrast. Thus both image processing techniques and visual inspection can benefit from contrast enhancement. The images in Figs. 13.5 and 13.6 show the application of the filter (13.5) and clearly indicate that the constraint parameter $c$ in (13.3) gives a significant level of control over the degree of enhancement. Another property of this filter is worth emphasizing: due to its intrinsic nonlinearity, this filter does not produce any ringing at the edges it enhances. The proposed filter is similar in spirit to the toggle contrast filter [16]. The difference lies in the degree of contrast enhancement, which in the case of the proposed filter has an additional control, the intensity constraint $K(p)$. This control allows making the contrast change as strong as that of the toggle contrast filter or as subtle as no contrast change at all.
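The contrast enhancement operator can be sketched in the same brute-force style. The sketch assumes the proportional constraint (13.3) and the median for G; sending a pixel whose intensity equals the local average to the minimum is a tie-breaking assumption made here, and the function names and test image are illustrative only.

```python
from statistics import median

def disk_offsets(r):
    """Offsets of an isotropic kernel: a digital approximation of a circle."""
    return [(dy, dx) for dy in range(-r, r + 1) for dx in range(-r, r + 1)
            if dy * dy + dx * dx <= r * r]

def icfk_contrast(img, r, c):
    """ICFK contrast enhancement: push I_p to a boundary of the constrained range."""
    h, w = len(img), len(img[0])
    offsets = disk_offsets(r)
    out = [row[:] for row in img]
    for y in range(h):
        for x in range(w):
            nbrs = [img[y + dy][x + dx] for dy, dx in offsets
                    if 0 <= y + dy < h and 0 <= x + dx < w]
            i_p = img[y][x]
            if nbrs.count(i_p) == 1:
                out[y][x] = median(nbrs)  # unique level: noise suppression
                continue
            lo, hi = i_p - i_p * c, i_p + i_p * c
            sel = [v for v in nbrs if lo <= v <= hi]
            local_avg = sum(nbrs) / len(nbrs)
            # Below (or at) the local average: go darker; above it: go brighter
            out[y][x] = min(sel) if i_p <= local_avg else max(sel)
    return out

# Two identical rows containing a blurred edge 10 -> 12 -> 18 -> 20
row = [10, 10, 12, 18, 20, 20]
img = [row[:], row[:]]
strong = icfk_contrast(img, r=1, c=0.5)
subtle = icfk_contrast(img, r=1, c=0.1)
print(strong[0], subtle[0])
```

With the wide constraint the blurred transition collapses to a sharp step, while with the narrow constraint the image is left unchanged, which illustrates how the constraint parameter controls the degree of enhancement.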


Fig. 13.5 The dermatoscopic image after application of the contrast enhancement filter with radius r = 7, subject to the intensity constraint $K(p) = I_p \pm 0.03\, I_p$

Fig. 13.6 The dermatoscopic image after application of the contrast enhancement filter with radius r = 7, subject to the intensity constraint $K(p) = I_p \pm 0.1\, I_p$

13.3.3 Local Adaptive Threshold

If sharpening can be considered a dual operation to smoothing, so that a processing scheme producing a smoothing filter is naturally expected to produce a sharpening one, then here is an example of the versatility of the ICFK scheme and its ability to produce somewhat unexpected operators that still fall within the definition (13.2). Consider a local threshold operator defined by the functions:

$$F = G = \begin{cases} 1, & \text{if } \overline{H_{v_p}} \in H_{v_p}|_{K(p)}, \\ 0, & \text{if } \overline{H_{v_p}} \notin H_{v_p}|_{K(p)}, \end{cases} \tag{13.6}$$

where $\overline{H_{v_p}}$ is the average intensity of the area under the kernel $v$ at $p$. The operator (13.6) produces a binary image, attributing to the background the pixels at which the local average for the whole area under the kernel $v$ at $p$ is outside the constrained part of the histogram. The detector (13.6) can be useful in


Fig. 13.7 Dermatoscopic image, 398 × 339 pixels, of a skin lesion with hair, and overlay of direct application of the local adaptive threshold with a kernel of radius r = 5 and intensity constraint $K(p) = I_p \pm 0.2\, I_p$

Fig. 13.8 Overlay of application of the local adaptive threshold with a kernel of radius r = 5 and intensity constraint $K(p) = I_p \pm 0.2\, I_p$, followed by morphological cleaning

identifying narrow linear features in images. For example, one of the problems in the automatic diagnosis of skin lesions using dermoscopy is the removal of artifacts such as hairs and oil bubbles trapped in the immersion fluid. The detector (13.6) can identify both of these features, as they stand out against the local background. The left half of Fig. 13.7 shows an image with hair and some bubbles. In automated lesion diagnosis systems, hair and bubbles are undesirable artifacts that need to be detected as non-diagnostic features. Prior to application of the operator (13.6), the source image needs to be preprocessed in order to remove the ringing around the hairs caused by sharpening in the video capture device. The preprocessing consists of applying the edge preserving smoothing filter (13.2) with the kernel radius r = 3 and the intensity constraint (13.3) with c = 0.08. Direct application of filter


(13.6) to the preprocessed image gives the combined hair and bubble mask, which is presented as an overlay on the right of Fig. 13.7. Application of the same filter followed by post-cleaning, which utilizes some morphological operations, is presented in Fig. 13.8. The advantage of this threshold technique is its adaptation to the local intensity defined by the size of the processing kernel. All ICFK filters described above are implemented and available as part of the Pictorial Image Processor© package at www.pic-i-proc.com. A significant part of this work was first presented in [17].

Acknowledgments The author thanks Dr. Scott Menzies from Sydney Melanoma Diagnostic Centre and Michelle Avramidis from the Skintography Clinic for kindly providing dermatoscopic images. The author is also grateful to Prof. H. Talbot for pointing out some similarities between the proposed filters and existing filters.

References

1. Smith SM, Brady JM (1997) SUSAN: a new approach to low level image processing. Int J Comput Vis 23(1):45–78
2. Tomasi C, Manduchi R (1998) Bilateral filtering for gray and color images. In: Proceedings of the 1998 IEEE International Conference on Computer Vision. Bombay, India, pp 839–846
3. Kang H, Lee S, Chui CK (2009) Flow based image abstraction. IEEE Trans Vis Comput Graph 16(1):62–76
4. Durand F, Dorsey J (2002) Fast bilateral filtering for the display of high-dynamic-range images. ACM Trans Graph 21(3):257–266
5. Paris S, Durand F (2009) A fast approximation of the bilateral filter using a signal processing approach. Int J Comput Vis 81(1):24–52
6. Elad M (2002) On the bilateral filter and ways to improve it. IEEE Trans Image Process 11(10):1141–1151
7. Gil J, Werman M (1993) Computing 2-D min, median and max. IEEE Trans Pattern Anal Mach Intell 15:504–507
8. Weiss B (2006) Fast median and bilateral filtering. ACM Trans Graph (TOG) 25(3):519–526
9. Perreault S, Hebert P (2007) Median filtering in constant time. IEEE Trans Image Process 16(9):2389–2394
10. Gutenev A, From isotropic filtering to intensity constrained flat kernel filtering scheme. IEEE Trans Image Process (submitted for publication)
11. van Droogenbroeck M, Talbot H (1996) Fast computation of morphological operations with arbitrary structural element. Patt Recog Lett 17:1451–1460
12. Lee JS (1983) Digital image smoothing and the sigma filter. Comp Vis Graph Image Proc 24(2):255–269
13. Lukac R et al (2003) Angular multichannel sigma filter. In: Proceedings (ICASSP '03) IEEE international conference on acoustics, speech, and signal processing, vol 3, pp 745–748
14. Pehamberger H, Binder M, Steiner A, Wolff K (1993) In vivo epiluminescence microscopy: improvement of early diagnosis of melanoma. J Invest Dermatol 100:356S–362S
15. Menzies SW, Ingvar C, McCarthy WH (1996) A sensitivity and specificity analysis of the surface microscopy features of invasive melanoma. Melanoma Res 6:55–62
16. Kramer HP, Bruckner JB (1975) Iterations of a nonlinear transformation for enhancement of digital images. Pattern Recogn 7:53–58
17. Gutenev AA (2010) Intensity constrained flat kernel image filtering scheme: definition and applications. Lecture notes in engineering and computer science: Proceedings of the World Congress on Engineering, WCE 2010, vol I, 30 June–2 July, London, UK, pp 641–645

Chapter 14

Convolutive Blind Separation of Speech Mixtures Using Auditory-Based Subband Model

Sid-Ahmed Selouani, Yasmina Benabderrahmane, Abderraouf Ben Salem, Habib Hamam and Douglas O'Shaughnessy

Abstract A new blind speech separation (BSS) method for convolutive mixtures is presented. This method uses a sample-by-sample algorithm to perform the subband decomposition by mimicking the processing performed by the human ear. The unknown source signals are separated by maximizing the entropy of a transformed set of signal mixtures through the use of a gradient ascent algorithm. Experimental results show the efficiency of the proposed approach in terms of signal-to-interference ratio (SIR) and perceptual evaluation of speech quality (PESQ) criteria. Compared to the fullband method that uses the Infomax algorithm and to the convolutive fast independent component analysis (C-FICA), our method achieves a better PESQ score and shows an important improvement of SIR for different locations of sensor inputs.

S.-A. Selouani (&) Université de Moncton, Shippagan campus, Shippagan, NB E8S 1P6, Canada e-mail: [email protected] Y. Benabderrahmane D. O’Shaughnessy INRS-EMT, Université du Québec, Montreal, H5A 1K6, Canada e-mail: [email protected] D. O’Shaughnessy e-mail: [email protected] A. B. Salem H. Hamam Université de Moncton, Moncton, E1A 3E9, Canada e-mail: [email protected] H. Hamam e-mail: [email protected]

S. I. Ao and L. Gelman (eds.), Electrical Engineering and Applied Computing, Lecture Notes in Electrical Engineering, 90, DOI: 10.1007/978-94-007-1192-1_14, © Springer Science+Business Media B.V. 2011


14.1 Introduction

The practical goal of blind source separation (BSS) techniques is to extract the original source signals from their mixtures, and possibly to estimate the unknown mixing channel, using only the information in the observed signals, with no or very limited knowledge about the source signals and the mixing channel. For several years, the separation of sources has been a particularly active research topic [19]. This interest can be explained by the wide spectrum of possible applications, which includes telecommunications, acoustics, seismology, localization and tracking of radar and sonar targets, separation of speakers (the so-called ‘cocktail party problem’), detection and separation in multiple-access communication systems, etc. Methods to solve the BSS problem can be divided into methods using second-order [5] or higher-order statistics [8], the maximum likelihood principle [3], principal component analysis (PCA) and non-linear PCA [13], and independent component analysis (ICA) methods [11, 14, 18].

Another important category of methods is subband BSS. Subband BSS has many advantages over other frequency-domain BSS approaches with regard to the well-known permutation ambiguity of frequency bins [1]. In fact, the permutation problem is much less critical in subband BSS, since the number of subbands that could be permuted is obviously smaller than the number of frequency bins. In addition, applying decimation in each subband can considerably reduce the computational load compared to time-domain approaches (which can be computationally demanding in the case of real-room mixtures). In [2], the subband analysis/synthesis system uses a polyphase filterbank with oversampling and single side band modulation. In low frequency bands, longer unmixing filters with overlap-blockshift are used. In [15], the subband analysis filterbank is basically implemented as a cosine-modulated prototype filter. The latter is designed as a truncated sinc() function weighted by a Hamming window. In [20], the impulse responses of the synthesis filters are based on the extended lapped transform and are defined by using the cosine modulation function. In the approach reported in [18], analysis filters are obtained by a generalized discrete Fourier transform. Analysis and synthesis filters are derived from a unique prototype filter, which can be designed by an iterative least-squares algorithm with a cost function that includes a stopband attenuation term.

In the blind speech separation approach we propose, we separate mixed sources that are assumed to be statistically independent, without any a priori knowledge about the original source signals $s_j(n)$, $j \in \{1, \ldots, N\}$, using only the observations $x_i(n)$, $i \in \{1, \ldots, M\}$, from $M$ sensors. Such signals may be instantaneously or convolutively mixed. In this work, we are concerned with the convolutive case, i.e. the blind separation of convolved speech sources, where the source signals are filtered by impulse responses $h_{ij}(n)$ from source $j$ to sensor $i$. We are interested in the blind approach to separation, which has the advantage of not relying on strong assumptions about the mixture: apart from its overall structure, often assumed to be linear, no parameters are assumed to be known. In that case, the mixtures can be expressed in vector notation as

$$X(n) = \sum_{k=0}^{\infty} H(k)\, S(n-k), \tag{14.1}$$

where $X(n) = [x_1(n), \ldots, x_M(n)]^T$ is a vector of mixtures, $S(n) = [s_1(n), \ldots, s_N(n)]^T$ is a vector of speech sources, and $H(k) = [h_{ij}(k)]$, $(i,j) \in \{1, \ldots, M\} \times \{1, \ldots, N\}$, is a matrix of FIR filters. To blindly estimate the sources, an unmixing process is applied to the mixtures, and the estimated sources $Y(n) = [y_1(n), \ldots, y_N(n)]^T$ can be written as

$$Y(n) = \sum_{k=0}^{L-1} W(k)\, X(n-k), \tag{14.2}$$

where $W(k) = [w_{ij}(k)]$, $(i,j) \in \{1, \ldots, M\} \times \{1, \ldots, N\}$, is the unmixing matrix linking the $j$th output $y_j(n)$ with the $i$th mixture $x_i(n)$. This matrix is composed of FIR filters of length $L$. Each element is defined by the vector $w_{ij} = [w_{ij}(0), \ldots, w_{ij}(L-1)]$, $\forall (i,j) \in \{1, \ldots, M\} \times \{1, \ldots, N\}$. To mitigate problems in both the time and frequency domains, the next sections present and evaluate a new framework for the BSS of convolutive mixtures, based on subband decomposition using an ear-model-based filterbank and an information maximization algorithm. This chapter includes an extension of our previous work [6]. A new evaluation criterion is introduced, namely the PESQ, and a new set of experiments is carried out involving the well-known C-FICA method in different mixing conditions.
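The mixing model of Eq. 14.1 can be illustrated with a minimal sketch in plain Python, assuming finite-length FIR filters so that the sum over k is truncated. The function name `convolutive_mix` and the toy filter coefficients are hypothetical and serve only as an illustration; the same routine would apply to the unmixing stage of Eq. 14.2 with the unmixing filters in place of the mixing filters and the mixtures as input.

```python
def convolutive_mix(sources, filters):
    """Eq. 14.1 with FIR filters: x_i(n) = sum_j sum_k h_ij(k) * s_j(n - k).

    sources: list of N equal-length lists, one per source s_j.
    filters: M x N nested list; filters[i][j] is the impulse response h_ij.
    Samples before n = 0 are assumed to be zero.
    """
    n_samples = len(sources[0])
    mixtures = []
    for row in filters:                      # one row of filters per sensor i
        x = [0.0] * n_samples
        for h, s in zip(row, sources):       # contribution of source j to sensor i
            for n in range(n_samples):
                for k, hk in enumerate(h):
                    if n - k >= 0:
                        x[n] += hk * s[n - k]
        mixtures.append(x)
    return mixtures


# Toy 2 x 2 example with 2-tap filters (illustrative values, not measured HRTFs).
s = [[1.0, 0.0, 0.0, 0.0],
     [0.0, 2.0, 0.0, 0.0]]
h = [[[1.0, 0.5], [0.3, 0.0]],
     [[0.2, 0.0], [1.0, 0.4]]]
x = convolutive_mix(s, h)
```

Each sensor signal is the sum of every source filtered by its own impulse response, which is what distinguishes the convolutive case from an instantaneous mixture (where every filter has a single tap).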

14.2 Proposed Method

In this section, we introduce the convolutive mixture based on the head related transfer function (HRTF) that we used to evaluate the proposed method. Then we define the subband decomposition using the modeling of the mid-external ear and the basilar membrane, which aims at mimicking the human auditory system (HAS). Afterwards, the learning rule performing the separation of the sources is introduced.

14.2.1 HRTF Mixing Model

The perception of the acoustic environment, or room effect, is a complex phenomenon linked mainly to the multiple reflections, attenuation, diffraction and scattering that the acoustic wave undergoes on the elements of the physical environment as it propagates from source to ear. These phenomena can be modeled by filters representing the diffraction, scattering and reflection that a sound wave sustains during its travel between its source and the entrance of the listener's ear canal. These filters are commonly called head related transfer functions (HRTFs) [10]. The principle of measuring an HRTF is to place microphones in the ears and record the signals corresponding to different source positions. The HRTF is the transfer function between the source signals and the signals at the ears. The HRTF is then considered as a linear and time-invariant system. Each HRTF is represented by a causal and stable finite impulse response (FIR) filter. In our experiments, the sources are convolved with impulse responses modeling the HRTF. We tested our overall framework with mixing filters measured at the ears of a dummy head. We selected impulse responses associated with source positions defined by various angle values in relation to the dummy head (see Fig. 14.4).

14.2.2 Subband Decomposition

The proposed model of the HAS consists of three parts that simulate the behavior of the mid-external ear, the inner ear, and the hair cells and fibers. The external and middle ear are modeled using a bandpass filter that can be adjusted to the signal energy to take into account the various adaptive motions of the ossicles. The model of the inner ear simulates the behavior of the basilar membrane (BM), which acts substantially as a non-linear filter bank. Due to the variability of its stiffness, different places along the BM are sensitive to sounds with different spectral content. In particular, the BM is stiff and thin at the base, but less rigid and more sensitive to low frequency signals at the apex. Each location along the BM has a characteristic frequency, at which it vibrates maximally for a given input sound. This behavior is simulated in the model by a cascade filter bank. The number of filters in the bank depends on the sampling rate of the signals and on other parameters of the model, such as the overlapping factor of the filter bands or the quality factor of the resonant part of the filters. The final part of the model deals with the electromechanical transduction of the hair cells and afferent fibers and the encoding at the level of the synaptic endings [7, 21].

14.2.2.1 Mid-External Ear

The mid-external ear is modeled using a bandpass filter. For a mixture input $x_i(k)$, the recurrence formula of this filter is given by

$$x'_i(k) = x_i(k) - x_i(k-1) + a_1 x'_i(k-1) - a_2 x'_i(k-2), \tag{14.3}$$

where $x'_i(k)$ is the filtered output, $k = 1, \ldots, K$ is the time index and $K$ is the number of samples in a given block. The coefficients $a_1$ and $a_2$ depend on the sampling frequency $F_s$, the central frequency of the filter and its Q-factor.

14.2.2.2 Mathematical Model of the Basilar Membrane

After each frame is transformed by the mid-external filter, it is passed to the cochlear filter bank, whose frequency responses simulate those of the BM for an auditory stimulus in the outer ear. The formula of the model is as follows:

$$x''_i(k) = b_{1,i}\, x''_i(k-1) - b_{2,i}\, x''_i(k-2) + G_i \left[ x'_i(k) - x'_i(k-2) \right], \tag{14.4}$$

and its transfer function can be written as:

$$H_i(z) = \frac{G_i \left( 1 - z^{-2} \right)}{1 - b_{1,i}\, z^{-1} + b_{2,i}\, z^{-2}}, \tag{14.5}$$

where $x''_i(k)$ is the BM displacement, which represents the vibration magnitude at position $d_i$ and constitutes the BM response to a mid-external sound stimulus $x'_i(k)$. The parameters $G_i$, $b_{1,i}$ and $b_{2,i}$, respectively the gain and the coefficients of filter (or channel) $i$, are functions of the position $d_i$ along the BM. $N_c$ cochlear filters are used to realize the model. These filters are characterized by the overlapping of their bands and a large bandwidth. The BM is taken to have a length of 35 mm, which is approximately the case for humans [7]. Thus, each channel represents the state of approximately $\Delta = 1.46$ mm of the BM. The sample-by-sample algorithm providing the outputs of the BM filters is given as follows.
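The recurrences of Eqs. 14.3 and 14.4 can be evaluated one sample at a time, as in the minimal sketch below. It is an illustration under stated assumptions rather than the authors' implementation: the function names are hypothetical, the coefficients are passed in by the caller because their values are not given in the text, the initial filter states are assumed to be zero, and every channel is fed directly with the mid-external ear output, as Eq. 14.4 is written.

```python
def mid_external_ear(x, a1, a2):
    """Eq. 14.3: x'(k) = x(k) - x(k-1) + a1*x'(k-1) - a2*x'(k-2)."""
    out = []
    x_prev = 0.0            # x(k-1)
    y1 = y2 = 0.0           # x'(k-1), x'(k-2)
    for xk in x:
        yk = xk - x_prev + a1 * y1 - a2 * y2
        out.append(yk)
        x_prev = xk
        y2, y1 = y1, yk
    return out


def bm_channel(xp, b1, b2, gain):
    """Eq. 14.4: x''(k) = b1*x''(k-1) - b2*x''(k-2) + G*[x'(k) - x'(k-2)]."""
    out = []
    in1 = in2 = 0.0         # x'(k-1), x'(k-2)
    o1 = o2 = 0.0           # x''(k-1), x''(k-2)
    for xk in xp:
        ok = b1 * o1 - b2 * o2 + gain * (xk - in2)
        out.append(ok)
        in2, in1 = in1, xk
        o2, o1 = o1, ok
    return out


def cochlear_filterbank(x, a1, a2, channels):
    """Apply the mid-external ear filter, then one BM filter per channel.

    channels: list of (b1, b2, gain) tuples, one per cochlear filter
    (hypothetical container; the text uses Nc such filters).
    """
    xp = mid_external_ear(x, a1, a2)
    return [bm_channel(xp, b1, b2, g) for (b1, b2, g) in channels]
```

Because each output sample depends only on the current input and two past samples, the decomposition needs no block transform and only a few multiplications per sample and per channel.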


14.2.3 Learning Algorithm

After performing the subband decomposition, the separation of the convolved sources in each subband is done by the Infomax algorithm. Infomax was developed by Bell and Sejnowski for the separation of instantaneous mixtures [4]. Its principle consists of maximizing the output entropy or minimizing the mutual information between the components of $Y$ [23]. It is implemented by maximizing, with respect to $W$, the entropy of $Z = \Phi(Y) = \Phi(WX)$. Thus, the Infomax contrast function is defined as

$$C(W) = H(\Phi(WX)), \tag{14.6}$$

where $H(\cdot)$ is the differential entropy, which can be expressed as $H(a) = -E[\ln f_a(a)]$, where $f_a(a)$ denotes the probability density function of a variable $a$. The generalization of Infomax to the convolutive case is performed by using a feedforward architecture. Both causal and non-causal FIR filters are used in our experiments. With real-valued data for the vector $X$, the entropy maximization algorithm leads to the adaptation of the unmixing filter coefficients with a stochastic gradient ascent rule using a learning step $\mu$. The weights are then updated as follows:

$$W(0) = W(0) + \mu \left( [W(0)]^{-T} - \Phi(Y(n))\, X^T(n) \right), \tag{14.7}$$

and, $\forall k \neq 0$,

$$w_{ij}(k) = w_{ij}(k) - \mu\, \Phi(y_i(n))\, x_j(n-k), \tag{14.8}$$

where $W(0)$ is a matrix composed of the unmixing FIR filter coefficients as defined in Sect. 14.1, and $Y(n)$ and $X(n)$ are the separated sources and the observed mixtures, respectively. $\Phi(\cdot)$ is the score function of $y_i$, which is a non-linear function approximating the cumulative density function of the sources, as defined in Eq. 14.9, where $p(y_i)$ denotes the probability density function of $y_i$:

$$\Phi(y_i(n)) = -\frac{\partial p(y_i(n)) / \partial y_i(n)}{p(y_i(n))}. \tag{14.9}$$
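One step of the update rule in Eqs. 14.7 and 14.8 can be sketched as follows for a two-source, two-sensor system with causal filters. This is a minimal sketch under stated assumptions: the source density is taken to be proportional to 1/cosh(y), for which the score function of Eq. 14.9 reduces to tanh(y); this density, the function names and the data layout are illustrative choices that are not specified in the text.

```python
import math


def score(y):
    """Assumed score function: -p'(y)/p(y) = tanh(y) for p(y) proportional to 1/cosh(y)."""
    return math.tanh(y)


def infomax_step(W, x_hist, mu):
    """One stochastic gradient step of Eqs. 14.7-14.8 for a 2 x 2 system.

    W:      list of L matrices; W[k][i][j] is tap k of the unmixing filter w_ij.
    x_hist: x_hist[k][j] = x_j(n - k) for k = 0..L-1 (current and past mixtures).
    mu:     learning step.
    Returns the updated filters and the current outputs y(n).
    """
    L = len(W)
    # Feedforward outputs: y_i(n) = sum_k sum_j w_ij(k) * x_j(n - k)
    y = [sum(W[k][i][j] * x_hist[k][j] for k in range(L) for j in range(2))
         for i in range(2)]
    phi = [score(v) for v in y]

    # Inverse transpose of the 2 x 2 matrix W(0)
    (a, b), (c, d) = W[0]
    det = a * d - b * c
    inv_t = [[d / det, -c / det],
             [-b / det, a / det]]

    new_W = [[row[:] for row in Wk] for Wk in W]
    for i in range(2):
        for j in range(2):
            # Eq. 14.7: zero-lag taps
            new_W[0][i][j] += mu * (inv_t[i][j] - phi[i] * x_hist[0][j])
            # Eq. 14.8: all other taps
            for k in range(1, L):
                new_W[k][i][j] -= mu * phi[i] * x_hist[k][j]
    return new_W, y
```

In the subband framework, such a step would be run independently on the decimated mixtures of each cochlear channel.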

The block diagram of the proposed method is given in Fig. 14.1. The input signals, i.e. the set of mixtures, are first processed by the mid-external ear filter introduced in Eq. 14.3. The outputs are then passed through a filterbank representing the cochlear part of the ear. A decimation process is then performed on each subband output. Such decimation is useful for several reasons. First, it improves the convergence speed because the input signals are more whitened than in the time-domain approach. Second, the required unmixing filter length is reduced by a factor of $1/M$, where $M$ is the decimation factor. After decimation, we group the set of mixtures belonging to the same cochlear filter to form the input of the unmixing stage. The latter gives the separated sources of each subband, which are upsampled by a factor of $M$. The same filter bank is used for the synthesis stage. The estimated sources are obtained by adding the outputs of the different synthesis stages.

Fig. 14.1 The ear-based framework for the subband BSS of convolutive mixtures of speech
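The decimation and upsampling steps around the unmixing stage can be sketched as below. This is a minimal illustration with hypothetical function names; it assumes that each subband signal is already band-limited by its cochlear filter and that upsampling is done by zero insertion before the synthesis filter, neither of which is detailed in the text.

```python
def decimate(x, m):
    """Keep every m-th sample of a subband signal."""
    return x[::m]


def upsample(x, m):
    """Insert m - 1 zeros after each sample, restoring the original rate."""
    out = []
    for v in x:
        out.append(v)
        out.extend([0.0] * (m - 1))
    return out


# With a decimation factor of 4, an 8-sample subband block becomes 2 samples,
# so the unmixing filters operate on a signal that is four times shorter.
block = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
low_rate = decimate(block, 4)
restored_rate = upsample(low_rate, 4)
```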

14.3 Experiments and Results

A set of nine different signals, consisting of speakers (three females and six males) reading sentences for approximately 30 s, was used throughout the experiments. These speech signals were collected by Nion et al. [17]. The signals were downsampled to 8 kHz. The C-FICA algorithm (a convolutive extension of Fast-ICA, i.e. fast independent component analysis) and the full-band Infomax algorithm are used as baseline systems for evaluation. The C-FICA algorithm proposed by Thomas et al. [22] consists of time-domain extensions of the Fast-ICA algorithms developed by Hyvärinen et al. [11] for instantaneous mixtures. To evaluate the source contributions, C-FICA uses a least-squares criterion, whose optimization is carried out by a Wiener filtering process. The convolutive version of full-band Infomax introduced in Sect. 14.2.3 is also used in the evaluation tests.

14.3.1 Evaluation Criteria

To evaluate the performance of BSS methods, two objective measures were used, namely the signal-to-interference ratio (SIR) and the perceptual evaluation of speech quality (PESQ). The SIR has been shown to be an effective criterion for several methods aiming at reducing the effects of interference [9]. The SIR is an important quantity in communications engineering that indicates the quality of a speech signal between a transmitter and a receiver environment. It is selected as the criterion for optimization. This measure is defined by

$$SIR = 10 \log_{10} \frac{\| s_{target} \|^2}{\| e_{interf} \|^2}, \tag{14.10}$$

where $s_{target}(n)$ is an allowed deformation of the target source $s_i(n)$, and $e_{interf}(n)$ is an allowed deformation of the sources which accounts for the interference of the unwanted sources. Those signals are derived from a decomposition of a given estimate $y_i(n)$ of a source $s_i(n)$.

The second measure used to evaluate the quality of source separation is the PESQ. The latter is standardized in ITU-T recommendation P.862 [12] and is generally used to evaluate speech enhancement systems [16]. Theoretically, the results can be mapped to relevant mean opinion scores (MOS) based on the degradation of the speech sample. The algorithm predicts subjective opinion scores for degraded speech samples. PESQ returns a score from −0.5 to 4.5; higher scores indicate better quality. The code provided by Loizou in [16] is used in our experiments. In general, the reference signal is an original (clean) signal, and the degraded signal is the same utterance pronounced by the same speaker but subjected to diverse adverse conditions. In the PESQ algorithm, the reference and degraded signals are level-equalized to a standard listening level in the pre-processing stage. The gain of the two signals may vary considerably, so it is a priori unknown. In the original PESQ algorithm, the gains of the reference, degraded and corrected signals are computed based on the root mean square values of band-pass-filtered (350–3,250 Hz) speech.
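Once an estimated source has been decomposed into its target and interference components, Eq. 14.10 is a simple energy ratio. The sketch below assumes that this decomposition is already available (in the chapter it follows [9]); the function names and the toy signals are hypothetical.

```python
import math


def energy(sig):
    """Squared Euclidean norm of a signal."""
    return sum(v * v for v in sig)


def sir_db(s_target, e_interf):
    """Eq. 14.10: SIR = 10 * log10(||s_target||^2 / ||e_interf||^2), in dB."""
    return 10.0 * math.log10(energy(s_target) / energy(e_interf))


# Toy example: the interference amplitude is one tenth of the target amplitude,
# so the energy ratio is 100, i.e. an SIR of about 20 dB.
example_sir = sir_db([1.0, 1.0, 1.0, 1.0], [0.1, 0.1, 0.1, 0.1])
```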

14.3.2 Discussion

Different configurations of the subband analysis and synthesis stages, as well as of the decimation factor, have been tested. The number of subbands was fixed at 24. Through our experiments we observed that when we keep the whole number of subbands, the results were not satisfactory. In fact, we noticed that some subbands in the high frequencies are not used, and this causes audible distortions in the signals. However, as shown in Fig. 14.2, the best performance was achieved for $N'_c = 24$ and $M = 4$. In addition to the use of causal FIR filters, we adapted the unmixing stage weights for non-causal FIR filters by centering the $L$ taps. From Fig. 14.2, we observe that the causal FIR filters yield good results in SIR improvement when compared to the non-causal ones. Another set of experiments has been carried out to evaluate the performance in the presence of additive noise at the sensors. We used the signal-to-noise ratio (SNR), which is defined in [9] by

$$SNR = 10 \log_{10} \frac{\| s_{target} + e_{interf} \|^2}{\| e_{noise} \|^2}, \tag{14.11}$$

where $e_{noise}$ is an allowed deformation of the perturbing noise, and $s_{target}$ and $e_{interf}$ were defined previously. Figure 14.3 shows the SNR improvement obtained using our subband decomposition, compared with the fullband method, i.e. the Infomax algorithm in the convolutive case.

Fig. 14.2 SIR improvement for both causal and non-causal filters. We denote by $N'_c$ the number of filters that have been used among the $N_c$ filters, and $M$ is the decimation factor

Fig. 14.3 SNR comparison between the subband and fullband methods


We have also compared the proposed method with the well-known C-FICA and fullband Infomax techniques using the PESQ and SIR objective measures. Among the available data, we considered a two-input, two-output convolutive BSS problem. We convolutively mixed two speech signals pronounced by a man and a woman. We repeated this procedure with different pairs of sentences, and the averages of the evaluation measures (SIR and PESQ) were calculated. As illustrated in Fig. 14.4, we selected impulse responses associated with source positions defined by different angles in relation to the dummy head. As can be seen in Table 14.1, the proposed subband method is efficient and has the additional advantage that a pre-processing step is not necessary. The method was also verified subjectively by listening to the original, mixed and separated signals. The PESQ scores confirm the superiority of the proposed method in terms of intelligibility and quality of separation when compared to the baseline techniques. The best HRTF configuration was obtained for the 20–50° angles of the dummy head, where a PESQ of 4.24 and a SIR of 13.83 dB were achieved.

Fig. 14.4 The convolutive model with source positions at 30 and 40° angles in relation to the dummy head

Table 14.1 SIR and PESQ of proposed subband BSS, C-FICA method and full-band BSS

| Angle (°) | C-FICA PESQ | C-FICA SIR (dB) | Full-band BSS PESQ | Full-band BSS SIR (dB) | Proposed PESQ | Proposed SIR (dB) |
| 10 | 2.61 | 5.08 | 3.15 | 8.45 | 3.85 | 13.45 |
| 10 | 2.68 | 5.37 | 3.27 | 9.74 | 4.10 | 13.62 |
| 10 | 2.04 | 4.45 | 2.23 | 7.21 | 2.92 | 8.96 |
| 60 | 2.52 | 6.67 | 2.76 | 8.94 | 3.24 | 10.05 |
| 20 | 3.18 | 7.54 | 3.81 | 11.28 | 4.24 | 13.83 |
| 50 | 2.95 | 6.82 | 3.55 | 10.79 | 4.16 | 11.67 |
| 20 | 3.02 | 6.62 | 3.32 | 10.02 | 3.42 | 12.14 |
| 120 | 2.15 | 5.87 | 3.04 | 9.11 | 3.29 | 10.52 |
| 30 | 2.28 | 6.29 | 2.47 | 7.55 | 2.88 | 9.76 |
| 80 | 1.94 | 4.72 | 2.13 | 7.02 | 2.51 | 8.73 |

14.4 Conclusion

An ear-based subband BSS approach was proposed for the separation of convolutive mixtures of speech. The results showed that using a subband decomposition that mimics human perception, together with the Infomax algorithm, yields better results than the fullband and C-FICA methods. Experimental results showed the high efficiency of the new method in improving the SNR of the unmixed signals in the case of noisy sensors. It is worth noting that an important advantage of the proposed technique is that it uses a simple time-domain sample-by-sample algorithm to perform the decomposition and that it does not need a pre-processing step.

References

1. Araki S, Makino S, Nishikawa T, Saruwatari H (2001) Fundamental limitation of frequency domain blind source separation for convolutive mixture of speech. In: IEEE-ICASSP conference, pp 2737–2740
2. Araki S, Makino S, Aichner R, Nishikawa T, Saruwatari H (2005) Subband-based blind separation for convolutive mixtures of speech. IEICE Trans Fundamentals E88-A(12):3593–3603
3. Basak J, Amari S (1999) Blind separation of uniformly distributed signals: a general approach. IEEE Trans Neural Networks 10:1173–1185
4. Bell AJ, Sejnowski TJ (1995) An information maximization approach to blind separation and blind deconvolution. Neural Comput 7(6):1129–1159
5. Belouchrani A, Abed-Meraim K, Cardoso JF, Moulines E (1997) A blind source separation technique using second-order statistics. IEEE Trans Signal Process 45(2):434–444
6. Ben Salem A, Selouani SA, Hamam H (2010) Auditory-based subband blind source separation using sample-by-sample and Infomax algorithms. In: Lecture notes in engineering and computer science: proceedings of the World Congress on Engineering, WCE 2010, 30 June–2 July, London, UK, pp 651–655
7. Caelen J (1985) Space/time data-information in the A.R.I.A.L. project ear model. Speech Commun J 4:163–179
8. Cardoso JF (1989) Source separation using higher order moments. In: Proceedings IEEE ICASSP, Glasgow, UK, vol 4, pp 2109–2112
9. Fevotte C, Gribonval R, Vincent E (2005) BSS_EVAL toolbox user guide. IRISA, Rennes, France, Technical Report 1706 [Online]. Available: http://www.irisa.fr/metiss/bss_eval
10. Gardner B, Martin K Head related transfer functions of a dummy head [Online]. Available: http://www.sound.media.mit.edu/ica-bench/
11. Hyvärinen A, Karhunen J, Oja E (2001) Independent component analysis. Wiley, New York
12. ITU (2000) Perceptual evaluation of speech quality (PESQ), an objective method for end-to-end speech quality assessment of narrowband telephone networks and speech codecs. ITU-T Recommendation P.862
13. Karhunen J, Joutsensalo J (1994) Representation and separation of signals using nonlinear PCA type learning. Neural Networks 7:113–127
14. Kawaguchi A (2010) Statistical inference for independent component analysis based on polynomial spline model. In: IAENG conference, vol I, IMECS 2010, Hong Kong
15. Kokkinakis K, Loizou PC (2007) Subband-based blind signal processing for source separation in convolutive mixtures of speech. In: IEEE-ICASSP conference, pp 917–920
16. Loizou P (2007) Speech enhancement: theory and practice. CRC Press LLC, Boca Raton
17. Nion D, Mokios KN, Sidiropoulos ND, Potamianos A (2010) Batch and adaptive PARAFAC-based blind separation of convolutive speech mixtures. IEEE Trans Audio Speech Language Process 18(6):1193–1207
18. Park HM, Dhir CS, Oh SH, Lee SY (2006) A filter bank approach to independent component analysis for convolved mixtures. Neurocomputing 69:2065–2077
19. Pedersen MS, Larsen J, Kjems U, Parra LC (2007) A survey of convolutive blind source separation methods. Springer handbook on speech processing and speech communication. Springer, Berlin
20. Russel I, Xi J, Mertins A, Chicharo J (2004) Blind source separation of non-stationary convolutively mixed signals in the subband domain. In: IEEE-ICASSP conference, pp 481–484
21. Tolba H, Selouani SA, O’Shaughnessy D (2002) Auditory-based acoustic distinctive features and spectral cues for automatic speech recognition using a multistream paradigm. In: IEEE-ICASSP conference 2002, pp 837–840
22. Thomas J, Deville Y, Hosseini S (2006) Time-domain fast fixed-point algorithms for convolutive ICA. IEEE Signal Process Lett 13(4):228–231
23. Wax M, Kailath T (1985) Detection of signals by information theoretic criteria. IEEE Trans Acoust Speech Signal Process 33(2):387–392

Chapter 15

Time Domain Features of Heart Sounds for Determining Mechanical Valve Thrombosis

Sabri Altunkaya, Sadık Kara, Niyazi Görmüş and Saadetdin Herdem

Abstract Thrombosis of an implanted heart valve is a rare but lethal complication for patients with a mechanical heart valve. An echocardiogram of the mechanical heart valve is necessary to diagnose valve thrombosis definitively. Because of the difficulty of making an early diagnosis of thrombosis, and the cost of diagnostic equipment and operators, developing noninvasive, cheap and simple methods to evaluate the functionality of mechanical heart valves is quite significant, especially for first-step (primary care) medical centers. For this reason, time domain features obtained from auscultation of heart sounds are proposed in this chapter as a simple method to evaluate mechanical heart valve thrombosis. To this end, heart sounds of one patient with mechanical heart valve thrombosis and five patients with normally functioning mechanical heart valves were recorded. Time domain features of the recorded heart sounds, namely the skewness and kurtosis, were calculated and statistically evaluated using paired and unpaired t-tests. As a result, it is clearly seen that the skewness of the first heart sound is the most discriminative feature (p < 0.01), and it may be used fairly well in differentiating a normally functioning mechanical heart valve from a malfunctioning one.

S. Altunkaya (corresponding author), S. Herdem
Department of Electrical and Electronics Engineering, Selçuk University, 42075 Konya, Turkey

S. Kara
Biomedical Engineering Institute, Fatih University, 34500 Istanbul, Turkey

N. Görmüş
Department of Cardiovascular Surgery, Meram Medical School of Selcuk University, 42080 Konya, Turkey

S. I. Ao and L. Gelman (eds.), Electrical Engineering and Applied Computing, Lecture Notes in Electrical Engineering, 90, DOI: 10.1007/978-94-007-1192-1_15, © Springer Science+Business Media B.V. 2011

15.1 Introduction

Mechanical heart valve thrombosis is any thrombus attached to a mechanical valve, occluding part of the blood flow or interfering with valvular function [1]. Mechanical heart valve thrombosis is a critical complication associated with high mortality and requires immediate diagnosis and thrombolytic or surgical treatment [2]. Progress in the structure and design of mechanical heart valves over the years has led to a considerable improvement in their hemodynamic features and durability. However, thromboembolic complications remain a troublesome cause of postoperative morbidity and mortality [3, 4]. Reported incidences of thromboembolic complications vary across the literature: 0.03–4.3% per patient-year [2], 0.5–6% per patient-year [3], 2–4% of patients per annum [5], and 0.5% per patient-year [6], depending on the generation and thrombogenicity of the prosthesis used, the location of the valve, and the quality of the anticoagulation [2].

Recently, transesophageal echocardiography has become the gold standard both in the early diagnosis of prosthetic valve thrombosis and in risk stratification for obstruction or embolism in patients with prosthetic heart valves [7]. However, it is quite expensive to use echocardiography for the diagnosis of mechanical heart valve thrombosis in first-step (primary care) medical centers, because of both the need for a specialist to operate the equipment and the cost of the devices. This may lead to misdiagnosis of thrombotic complications in such centers. Therefore, developing noninvasive, cheap and simple methods to evaluate the functionality of the mechanical heart valve is quite important [8–11].

Only a limited number of studies evaluate mechanical heart valve thrombosis using heart sounds. Although studies on the frequency spectrum of mechanical heart valve sounds are also few, it is known that thrombus formation on a prosthetic heart valve changes the frequency spectrum of both biological and mechanical heart valves. In past studies, features obtained from frequency and time–frequency analysis of heart sounds were used to detect mechanical heart valve thrombosis. In these studies, the modified forward–backward Prony's method was generally used to detect the frequency components of the prosthetic heart valve sounds [8–10, 12, 13]. In this chapter, time domain features, instead of frequency domain features, are proposed to evaluate thrombosis on the mechanical heart valve. To this end, heart sounds of one patient with thrombosis and heart sounds of five patients with normally functioning mechanical heart valves were recorded. The skewness and kurtosis of the heart sounds were computed as time domain features and statistically evaluated using the t-test.


Table 15.1 Clinical information of patients

| Pat. no | Sex | Age | Valve size (mm) | Valve type | Condition |
| 1 | F | 58 | 25 | Sorin | Normal |
| 2 | F | 27 | 29 | Sorin | Normal |
| 3 | F | 35 | 29 | Sorin | Normal |
| 4 | F | 30 | 29 | St.Jude | Normal |
| 5 | F | 45 | 29 | St.Jude | Normal |
| 6 | F | 55 | 29 | Sorin | With thrombosis |

15.2 Patients and Data Acquisition

This study includes patients who were operated on in the Department of Cardiovascular Surgery, Meram Medical School of Selcuk University. Five patients with normally functioning mechanical heart valves and one patient with mechanical valve thrombosis were selected to evaluate mechanical heart valve thrombosis using heart sounds. The heart sounds of the patient with thrombosis were recorded before and after thrombolytic treatment. The functionality of the mechanical heart valves of the patients was investigated by the physician using echocardiography. In the echocardiographic investigation, a thrombus with partial obstruction was observed on the mitral mechanical heart valve of one patient. The heart sounds of the five patients with normally functioning mechanical heart valves were recorded within one year after heart valve replacement. The heart sounds of the patients were recorded from the mitral area (the intersection of the left fifth intercostal space and the midclavicular line) over the entire course of 30 s [14]. All patients had mitral valve replacement, and the clinical information of these patients is shown in Table 15.1. ECG signals were recorded simultaneously with the heart sounds in order to segment the first and second heart sounds. An E-Scope II electronic stethoscope manufactured by Cardionics was used to record the heart sounds. The sound signals obtained from the electronic stethoscope and the ECG signals obtained from the surface electrodes were digitized at a 5000 Hz sampling frequency using the Biopac MP35 data acquisition device.

15.3 Extraction of First Heart Sounds (S1) and Second Heart Sounds (S2)

As mentioned in the previous section, 30 s of heart sounds were recorded from each patient. The heart sound signal contains one S1 and one S2 component for each heartbeat. This section describes the detection of S1 and S2, whose number varies with the number of heartbeats in the recording. It is known that S1 occurs after the onset of the QRS complex and that S2 occurs towards the end of the T wave of the ECG. Using these two relations between the heart sounds and the ECG, the S1 and S2 sounds were obtained from the 30 s record. The processing of the recorded heart sound signal can be summarized as follows. First, filtering and normalization of the recorded heart sounds and ECG signal are performed. After that, the QRS and T peaks of the ECG signal are detected. Finally, the S1 and S2 sounds are detected using the QRS and T peaks [14].

15.3.1 Preprocessing of ECG and Heart Sounds Signals

All recorded heart sounds were filtered with a 30 Hz high-pass and a 2000 Hz low-pass digital finite impulse response filter to remove noise, and were normalized using

$$HS_{norm}(n) = \frac{HS(n)}{\max |HS(n)|}, \tag{15.1}$$

where $HS(n)$ is the raw heart sound signal and $HS_{norm}(n)$ is the normalized heart sound signal. The same normalization was also applied to the ECG signal.
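The preprocessing step can be sketched as below: a band-pass FIR filter covering 30–2000 Hz, followed by the peak normalization of Eq. 15.1. The text does not state how the FIR filter was designed, so the windowed-sinc (Hamming) design, the number of taps and the function names are assumptions made for illustration only; with a short filter the 30 Hz edge is only coarsely resolved.

```python
import math


def sinc(x):
    """Normalized sinc function, sin(pi*x) / (pi*x)."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)


def bandpass_fir(num_taps, f_lo, f_hi, fs):
    """Windowed-sinc band-pass FIR (Hamming window); assumed design, odd num_taps > 1."""
    mid = (num_taps - 1) / 2.0
    lo, hi = f_lo / fs, f_hi / fs            # cut-offs in cycles per sample
    taps = []
    for n in range(num_taps):
        m = n - mid
        ideal = 2.0 * hi * sinc(2.0 * hi * m) - 2.0 * lo * sinc(2.0 * lo * m)
        window = 0.54 - 0.46 * math.cos(2.0 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    return taps


def fir_filter(x, taps):
    """Direct-form FIR filtering; samples before n = 0 are taken as zero."""
    return [sum(t * x[n - k] for k, t in enumerate(taps) if n - k >= 0)
            for n in range(len(x))]


def normalize(hs):
    """Eq. 15.1: divide the signal by its maximum absolute value."""
    peak = max(abs(v) for v in hs)
    return [v / peak for v in hs]


# Example with the sampling rate and band edges quoted in the text.
taps = bandpass_fir(101, 30.0, 2000.0, 5000.0)
hs_norm = normalize([0.5, -2.0, 1.0])
```

After normalization every sample lies in the interval [-1, 1], which is what the Shannon energy used later for S1 and S2 localization expects.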

15.3.2 Detection of QRS Complex and T Peak

The QRS complexes of the ECG signal are detected using a first-derivative-based QRS detection algorithm. In this algorithm, the ECG signal is first band-pass filtered with a pass band of 10–20 Hz to eliminate baseline wander and high frequency noise. After filtering, the ECG signal is differentiated to obtain QRS complex slope information, squared point by point to emphasize the QRS complex in the ECG signal, and then time-averaged by taking the mean of the previous 10 points. The time-averaged ECG signal is compared to a threshold to obtain the QRS complex [15, 16]. The threshold is chosen to be a quarter of the maximum of the time-averaged ECG signal. As a result of this comparison, the maximum of the time-averaged ECG signal within each region that exceeds the threshold is accepted as an R peak. After that, all intervals between consecutive R peaks (RR intervals) are compared to 0.5 and 1.5 times the mean RR interval. If an RR interval is longer than 1.5 times the mean RR interval or shorter than 0.5 times the mean RR interval, then this RR interval and the corresponding segment of the heart sounds are removed from the signal to prevent wrong detection of RR intervals. T waves are detected using the physiological knowledge that the peak of the T wave occurs at least 60 ms after the R peak and normally within two-thirds of the RR interval. The maximum of the ECG signal in this interval is used as the T peak to detect the location of S2 [17].
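The detection steps described above can be sketched in plain Python as follows. This is a minimal sketch, not the authors' code: the input is assumed to be already band-pass filtered (10–20 Hz), the averaging window is taken to include the current sample, each R peak is taken as the maximum of one contiguous above-threshold region, and the function names are hypothetical.

```python
def detect_r_peaks(ecg_filtered, window=10):
    """Derivative, squaring, moving average and threshold (a quarter of the maximum).

    Returned indices refer to the time-averaged signal, which is offset from
    the ECG by at most a few samples.
    """
    diff = [ecg_filtered[n] - ecg_filtered[n - 1] for n in range(1, len(ecg_filtered))]
    squared = [d * d for d in diff]
    averaged = []
    for n in range(len(squared)):
        seg = squared[max(0, n - window + 1):n + 1]
        averaged.append(sum(seg) / len(seg))
    threshold = max(averaged) / 4.0
    peaks = []
    n = 0
    while n < len(averaged):
        if averaged[n] > threshold:
            start = n
            while n < len(averaged) and averaged[n] > threshold:
                n += 1
            region = averaged[start:n]
            peaks.append(start + region.index(max(region)))
        else:
            n += 1
    return peaks


def valid_rr_intervals(peaks):
    """Keep RR intervals between 0.5 and 1.5 times the mean RR interval."""
    rr = [b - a for a, b in zip(peaks, peaks[1:])]
    mean_rr = sum(rr) / len(rr)
    return [(a, b) for a, b in zip(peaks, peaks[1:])
            if 0.5 * mean_rr <= b - a <= 1.5 * mean_rr]


def detect_t_peak(ecg, r_idx, next_r_idx, fs):
    """Maximum of the ECG between R + 60 ms and R + two-thirds of the RR interval."""
    start = r_idx + round(0.060 * fs)
    stop = r_idx + round(2.0 * (next_r_idx - r_idx) / 3.0)
    segment = ecg[start:stop + 1]
    return start + segment.index(max(segment))
```

Rejecting implausible RR intervals before searching for T peaks keeps a missed or spurious R peak from shifting the search windows used for S1 and S2.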

15.3.3 Detection of S1 and S2

The ECG signal and the Shannon energy are used to detect the heart sounds. The ECG signal is used as a time reference to determine the time interval


Fig. 15.1 ECG, heart sounds and Shannon energy of one patient with MVR

where S1 and S2 are searched over one heart cycle. The Shannon energy of heart sounds is used for exact determination of the location of S1 and S2 in the finite interval. The Shannon energy of the normalized heart valve sound (HSnorm) can be calculated using

$$SE = -\frac{1}{N} \sum_{n=1}^{N} HS_{norm}^{2}(n) \log HS_{norm}^{2}(n) \qquad (15.2)$$

where SE is the Shannon energy of HSnorm, N is the length of the recorded data and n is the index of HSnorm [18]. After the Shannon energy of the heart sounds is calculated, the location of S1 in the RR interval is determined by taking the maximum point of the Shannon energy in the interval between 0.01 RR and 0.2 RR as the center of S1. The maximum point of the Shannon energy between the ECG T peak and the T peak time plus 150 ms is accepted as the center of S2. The durations of S1 and S2 were chosen to be 150 and 75 ms, respectively, on both sides of the center (Fig. 15.1). If the Shannon energy on the right or left side of the center is larger than 40% of the maximum Shannon energy, the duration of the chosen heart sound is increased by 20%. The comparison is repeated until the Shannon energy on the right or left side of the center is smaller than 40% of the maximum Shannon energy [17, 19]. In Fig. 15.1, the upper graph shows the RR interval of the ECG signal, the 0.3–0.65 RR interval, and the interval from the ECG T peak to the T peak time plus 150 ms; the middle graph shows the heart sound signal; and the bottom graph shows the calculated Shannon energy.
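As a hedged illustration of the normalization in Eq. 15.1 and the Shannon energy in Eq. 15.2, the following Python sketch computes both for a short segment. The natural logarithm is assumed (the text writes only "log"), samples equal to zero are taken to contribute nothing (the limit of x² log x² as x tends to 0), and the function names are hypothetical.

```python
import math


def normalize(signal):
    """Scale a signal by its maximum absolute value (cf. Eq. 15.1)."""
    peak = max(abs(s) for s in signal)
    return [s / peak for s in signal]


def shannon_energy(hs_norm):
    """Average Shannon energy of a normalized segment (cf. Eq. 15.2)."""
    total = 0.0
    for x in hs_norm:
        squared = x * x
        if squared > 0.0:  # x^2 * log(x^2) tends to 0 as x tends to 0
            total += squared * math.log(squared)
    return -total / len(hs_norm)


segment = normalize([2.0, -4.0, 1.0])
energy = shannon_energy(segment)
```

To locate a maximum inside a search interval, as the text describes, the same function would be applied to short successive windows of the signal; the window length is a design choice that the text does not specify.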


Table 15.2 Mean ± standard deviation (std.) of the skewness and kurtosis of heart sounds

                  Thr (mean ± std.)   AThr (mean ± std.)   N (mean ± std.)
Skewness of S1    0.96 ± 0.36         0.18 ± 0.25          0.12 ± 0.42
Skewness of S2    0.71 ± 0.7          0.3 ± 0.36           -0.2 ± 0.54
Kurtosis of S1    5.24 ± 1.11         4.34 ± 0.45          5.9 ± 1.75
Kurtosis of S2    8.39 ± 2.47         5.65 ± 1.56          5.79 ± 1.92

15.4 Skewness and Kurtosis

The change in the signal, or in the distribution of the signal segments, is measured in terms of the skewness and kurtosis. The skewness characterizes the degree of asymmetry of a distribution around its mean. The skewness is defined for a real signal x(n) as

$$\text{Skewness} = \frac{E\left[(x - \mu)^{3}\right]}{\sigma^{3}} \qquad (15.3)$$

where $\mu$ is the mean, $\sigma$ is the standard deviation and E denotes statistical expectation. A nonzero skewness shows that the data are asymmetrically distributed around the mean. If the distribution is more to the right of the mean point the skewness is negative. If the distribution is more to the left of the mean point the skewness is positive. The skewness is zero for a symmetric distribution. The kurtosis measures the relative peakedness or flatness of a distribution. The kurtosis for a real signal x(n) is calculated using

$$\text{Kurtosis} = \frac{E\left[(x - \mu)^{4}\right]}{\sigma^{4}} \qquad (15.4)$$

where $\mu$ is the mean, $\sigma$ is the standard deviation and E denotes statistical expectation. For symmetric unimodal distributions, a kurtosis higher than 3 indicates heavy tails and peakedness relative to the normal distribution, whereas a kurtosis lower than 3 indicates light tails and flatness [20, 21].
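The two definitions can be evaluated directly on a sampled segment. The sketch below is one possible reading, assuming the expectation is replaced by the sample average (the population, or biased, form) and using the hypothetical function name `skewness_and_kurtosis`.

```python
import math


def skewness_and_kurtosis(x):
    """Sample skewness and kurtosis with E[.] taken as the sample average."""
    n = len(x)
    mean = sum(x) / n
    variance = sum((v - mean) ** 2 for v in x) / n
    sigma = math.sqrt(variance)
    third = sum((v - mean) ** 3 for v in x) / n
    fourth = sum((v - mean) ** 4 for v in x) / n
    return third / sigma ** 3, fourth / sigma ** 4


# A symmetric segment has zero skewness.
symmetric = skewness_and_kurtosis([-1.0, 0.0, 1.0])
# Most samples lie to the left of the mean, with a long right tail:
# the skewness is positive, matching the sign convention in the text.
right_tailed = skewness_and_kurtosis([0.0, 0.0, 0.0, 4.0])
```

A constant segment has zero standard deviation and would cause a division by zero; a practical implementation would need to guard against that case.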

15.5 Result and Discussion

There are approximately thirty first heart sound (S1) and thirty second heart sound (S2) components in the 30 s heart sound recording of each patient. The skewness and kurtosis of all of these S1 and S2 components were calculated for all recorded heart sounds. Table 15.2 shows the mean and standard deviation (std.) of the skewness and kurtosis of the heart sounds of one patient with mechanical heart valve thrombosis (Thr), the heart sounds of the same patient after thrombolytic treatment (AThr) and the heart sounds of five patients with normally functioning mechanical heart valves (N). Figure 15.2 illustrates the summary statistics for the skewness and kurtosis of S1 and S2 of Thr, AThr and N.


Fig. 15.2 Box plot for the skewness of S1, the skewness of S2, the kurtosis of S2 and the kurtosis of S1 (Thr: patient with mechanical valve thrombosis, AThr: patient after thrombolytic treatment and N: five patients with normally functioning mechanical valve)

From Table 15.2 and Fig. 15.2, it can be said that there are meaningful differences between the means of the skewness of S1, the skewness of S2 and the kurtosis of S2 for heart sounds of patients with normally functioning and malfunctioning mechanical heart valves. The kurtosis of S1 has a similar mean for these heart sounds. However, it is clearly seen that the skewness of S1 is the best feature for showing the difference between normally functioning and malfunctioning mechanical heart valves. A paired t-test with a 99% confidence level was used to compare the means of the skewness and kurtosis between the heart sounds of a patient before and after thrombolytic treatment was administered. An unpaired t-test with a 99% confidence level was used to compare the means of the skewness and kurtosis between patients with normally functioning mechanical heart valves and the patient with mechanical heart valve thrombosis (before treatment). These tests were applied to two features, the skewness and kurtosis, obtained from the two sound components S1 and S2. The p values obtained from these tests are shown in Table 15.3. The skewness of S2 only between Thr and N, the kurtosis of S1 only between Thr and AThr, and the kurtosis of S2 only between Thr and N show statistically significant differences (p < 0.01). But the skewness of S1 shows statistically


Table 15.3 p values obtained from the paired and unpaired t-tests

                  Between heart sounds of Thr and AThr   Between heart sounds of Thr and N
                  (paired t-test)                        (unpaired t-test)
Skewness of S1    0.000006                               0.0000018
Skewness of S2    0.0011                                 0.0199
Kurtosis of S1    0.0712                                 0.0046
Kurtosis of S2    0.0924                                 0.0016

Thr: patient with mechanical heart valve thrombosis. AThr: patient with mechanical heart valve thrombosis after thrombolytic treatment. N: five patients with normally functioning mechanical heart valve.

significant differences both between Thr and AThr and between Thr and N (p < 0.01). Because of this, the skewness of S1 is the best feature for distinguishing the heart sounds of a patient with mechanical valve thrombosis from those of patients with a normally functioning mechanical heart valve.

15.6 Conclusion and Future Work

In this chapter, the skewness and kurtosis of heart sounds of patients with mechanical heart valve thrombosis and with normally functioning mechanical heart valves were compared statistically. The results suggest that the skewness of S1 should perform fairly well in differentiating normally functioning and malfunctioning mechanical heart valves. However, the effectiveness of the skewness of S1 in detecting a malfunctioning mechanical heart valve should be proven with a large patient population. After that, the skewness of S1 of mechanical heart sound signals may be used for analysis of mechanical heart valve sounds with a view to detecting thrombosis formation on mechanical heart valves.

Acknowledgments This work was supported by the scientific research projects (BAP) coordinating office of Selçuk University.

References

1. Edmunds LH, Clark RE, Cohn LH, Grunkemeier GL, Miller DC, Weisel RD (1996) Guidelines for reporting morbidity and mortality after cardiac valvular operations. J Thorac Cardiovasc Surg 112:708–711
2. Roudaut R, Lafitte S, Roudaut MF, Courtault C, Perron JM, Jaïs C et al (2003) Fibrinolysis of mechanical prosthetic valve thrombosis. J Am Coll Cardiol 41(4):653–658
3. Caceres-Loriga FM, Perez-Lopez H, Santos-Gracia J, Morlans-Hernandez K (2006) Prosthetic heart valve thrombosis: pathogenesis, diagnosis and management. Int J Cardiol 110:1–6
4. Roscitano A, Capuano F, Tonelli E, Sinatra R (2005) Acute dysfunction from thrombosis of mechanical mitral valve prosthesis. Braz J Cardiovasc Surg 20(1):88–90
5. Schlitt A, Hauroeder B, Buerke M, Peetz D, Victor A, Hundt F, Bickel C et al (2002) Effects of combined therapy of clopidogrel and aspirin in preventing thrombosis formation on mechanical heart valves in an ex vivo rabbit model. Thromb Res 107:39–43
6. Koller PT, Arom KV (1995) Thrombolytic therapy of left-sided prosthetic valve thrombosis. Chest 108:1683–1689
7. Kaymaz C, Özdemir N, Çevik C, Izgi C, Özveren O, Kaynak E et al (2003) Effect of paravalvular mitral regurgitation on left atrial thrombosis formation in patients with mechanical mitral valves. Am J Cardiol 92:102–105
8. Kim SH, Lee HJ, Huh JM, Chang BC (1998) Spectral analysis of heart valve sound for detection of prosthetic heart valve diseases. Yonsei Med J 39(4):302–308
9. Kim SH, Chang BC, Tack G, Huh JM, Kang MS, Cho BK, Park YH (1994) In vitro sound spectral analysis of prosthetic heart valves by mock circulatory system. Yonsei Med J 35(3):271–278
10. Candy JV, Meyer AW (2001) Processing of prosthetic heart valve sounds from anechoic tank measurements. 8th International Congress on Sound and Vibration. China
11. Grigioni M, Daniele C, Gaudio CD, Morbiducci U, D’avenio G, Meo DD, Barbaro V (2007) Beat to beat analysis of mechanical heart valves by means of return map. J Med Eng Technol 31(2):94–100
12. Sava HP, Bedi R, McDonnell TE (1995) Spectral analysis of carpentier-edwards prosthetic heart valve sounds in the aortic position using svd-based methods. Signal Process Cardiogr IEE Colloq 6:1–4
13. Sava HP, McDonnell JTE (1996) Spectral composition of heart sounds before and after mechanical heart valve implantation using a modified forward-backward Prony’s method. IEEE Trans Biomed Eng 43(7):734–742
14. Altunkaya S, Kara S, Görmüş N, Herdem S (2010) Statistically evaluation of mechanical heart valve thrombosis using heart sounds. Lecture notes in engineering and computer science: Proceedings of the World Congress on Engineering 2010, WCE 2010, 30 June–2 July, 2010, London, UK, 704–708
15. Pan J, Tompkins WJ (1985) A real-time QRS detection algorithm. IEEE Trans Biomed Eng 32(3):230–236
16. Köhler BU, Hennig C, Orglmeister R (2002) The principles of software QRS detection. IEEE Eng Med Biol Mag 21(2):42–57
17. Syed Z, Leeds D, Curtis D, Nesta F, Levine RA, Guttag J (2007) A framework for the analysis of acoustical cardiac signals. IEEE Trans Biomed Eng 54(4):651–662
18. Choi S, Jiang Z (2008) Comparison of envelope extraction algorithms for cardiac sound signal segmentation. Expert Syst Appl 34(2):1056–1069
19. El-Segaier M, Lilja O, Lukkarinen S, Ornmo LS, Sepponen R, Pesonen E (2005) Computer-based detection and analysis of heart sound and murmur. Ann Biomed Eng 33(7):937–942
20. Sanei S, Chambers JA (2007) EEG signal processing. Wiley, Chichester, pp 50–52
21. Press WH, Flannery BP, Teukolsky SA, Vetterling WT (1993) Numerical recipes in C: the art of scientific computing. Cambridge University Press, Cambridge, pp 610–615

Chapter 16

On the Implementation of Dependable Real-Time Systems with Non-Preemptive EDF

Michael Short

Abstract Non-preemptive schedulers remain a very popular choice for practitioners of resource constrained real-time embedded systems. This chapter is concerned with the non-preemptive version of the Earliest Deadline First algorithm (npEDF). Although several key results indicate that npEDF should be considered a viable choice for use in resource-constrained real-time systems, these systems have traditionally been implemented using static, table-driven approaches such as the ‘cyclic executive’. This is perhaps due to several popular misconceptions regarding the basic operation, optimality and robustness of this algorithm. This chapter will attempt to redress this balance by showing that many of the supposed ‘problems’ attributed to npEDF can be easily overcome by adopting appropriate implementation and analysis techniques. Examples are given to highlight the fact that npEDF generally outperforms other non-preemptive software architectures when scheduling periodic and sporadic tasks. The chapter concludes with the observation that npEDF should in fact be considered as the algorithm of choice for such systems.

16.1 Introduction

This chapter is concerned with the non-preemptive scheduling of recurring (periodic/sporadic) task models, with applications to resource-constrained, single-processor real-time and embedded systems. In particular, it is concerned with scheduler architectures consisting of a small amount of hardware (typically a timer/interrupt controller) and associated software. In this context, there are two

M. Short (&) Electronics & Control Group, Teesside University, Middlesbrough, TS1 3BA, UK e-mail: [email protected]

S. I. Ao and L. Gelman (eds.), Electrical Engineering and Applied Computing, Lecture Notes in Electrical Engineering, 90, DOI: 10.1007/978-94-007-1192-1_16, Springer Science+Business Media B.V. 2011


Fig. 16.1 Aspects of real-time embedded scheduling

main requirements of a scheduler. The first is task activation, which is the process of deciding at which points in time tasks become ready for execution. Periodic tasks are normally activated via a timer; event driven (sporadic) tasks can be either directly activated by interrupts or by polling an interrupt status flag. The second is task dispatching, which is the process of deciding which of the active tasks is best to execute, and some form of scheduling algorithm is normally required to achieve this. These two main aspects of scheduling are illustrated in Fig. 16.1. The performance of scheduling algorithms and techniques is an area worthy of study; the seminal paper of Liu and Layland [1], published in 1973, spawned a multitude of research and a significant body of results can now be found in the literature. Liu and Layland were the first to discuss deadline-driven scheduling techniques in the context of real-time and embedded computing. It is known that when task preemption is allowed, this technique—also known as Earliest Deadline First (EDF)—allows the full utilization of the CPU, and is optimal on a single processor under a wide variety of different operating constraints [1–3]. 
However, for developers of systems with severe resource constraints, preemptive scheduling techniques may not be viable; the study of non-preemptive alternatives is justified for the following (non-exhaustive) list of reasons [4–7]:

• Non-preemptive scheduling algorithms are easier to implement than their preemptive counterparts, and can exhibit dramatically lower runtime overheads;
• Non-preemptive scheduling naturally guarantees exclusive access to resources, eliminating the need for complex resource access protocols;
• Task sets under non-preemptive scheduling can share a common stack and processor context, leading to vastly reduced memory requirements;
• Cache and pipeline-related flushing following task preemptions does not occur in non-preemptive systems;
• Overload detection and recovery methods can be easier to implement;
• Initial studies seem to indicate that non-preemptive systems are less susceptible to transient errors than their preemptive counterparts.


Along with these advantages, non-preemptive scheduling is also known to have several associated disadvantages: task response times are generally longer, event-driven (sporadic) task executions are not as well supported, and when preemption is not allowed, many scheduling problems become either NP-Complete or NP-Hard [8]. This work is concerned with systems implementing the non-preemptive version of EDF (npEDF). The main motivating factors for the work are as follows. Although the treatment of npEDF in the literature has been (comparatively) small, several key results exist that indicate npEDF can overcome most (perhaps not all) of the problems associated with non-preemption; as such it should be considered as a viable choice for use in resource-constrained real-time and embedded systems. However, such systems have traditionally been implemented using static, table-driven approaches such as the ‘cyclic executive’ and its variants (see, for example, [4, 9–11]). This is perhaps due to several popular misconceptions (see footnote 1) with respect to the basic operation, implementation complexity, optimality and robustness of the npEDF algorithm, leading to a general lack of coverage in the wider academic community. This chapter will attempt to redress this balance by arguing the case for npEDF, and showing that the supposed ‘problems’ commonly attributed to it either simply do not hold, or can easily be overcome by adopting an appropriate implementation and by applying simple off-line analysis techniques.

The chapter is organized as follows. Section 16.2 considers why npEDF seems to be ‘missing’ from most major texts on real-time systems. Section 16.3 presents the assumed task model, and gives a basic description of npEDF. This section also identifies and expands a list of its common criticisms. Section 16.4 subsequently addresses each of these criticisms in turn, to establish their validity (or otherwise). Section 16.5 concludes the chapter.

16.2 npEDF: A Missing Algorithm?

In most of the major texts in the field of real-time systems, npEDF does not get more than a passing mention. For example, analysis of non-preemptive scheduling is typically restricted to the use of ‘cyclic executives’ or ‘timeline schedulers’. In almost all cases, after problems have been identified with such scheduling models, attention is then focused directly on Priority-Driven Preemptive (PDP) approaches as a ‘cure for all ills’. For example, Buttazzo [5] discusses timeline scheduling in Chap. 4 of his (generally) well-respected book on hard real-time computing systems, concluding with a list of problems associated with this type of scheduling. On p. 78, immediately before moving on to descriptions of PDP algorithms, it is stated that “The problems outlined above of timeline scheduling can be solved by using priority-based [preemptive] algorithms.”

Footnote 1: The key results for npEDF, and their implications, are comparatively more difficult to interpret than those for other types of scheduling; for example, many previous works assume the reader possesses an in-depth understanding of formal topics in computer science, such as computational complexity.


Liu takes a similar approach in what is perhaps the most widely-acclaimed book in this area (Real-Time Systems) [12]. Cyclic scheduling is discussed in Chap. 5 of her book, ending with a list of associated problems on p. 122. In each case, it is stated that a PDP system can overcome the problem. This type of argument is by no means limited to reference texts. Burns et al. [9] describe (in depth) some techniques that can be used for generating feasible cyclic or timeline schedules, followed by a discussion of the problems associated with this type of scheduling, directly followed by a final section (p. 160) discussing “Priority [-based preemptive] scheduling as an alternative to cyclic scheduling”. Whilst it is clearly untrue to say these statements are false, as stated above PDP scheduling is not without its own problems; the next section will examine the basics of npEDF, and examine why it seems to have been overlooked.

16.3 Task Model and Preliminaries

16.3.1 Recurring Computational Tasks

This work is concerned with the implementation of recurring/repeated computations on a single processor, such as those that may be required in signal processing and control applications. Such a system may be represented by a set $\tau$ of n tasks, where each task $\tau_i \in \tau$ is represented by the tuple:

$$\tau_i = (p_i, c_i, d_i) \qquad (16.1)$$

in which $p_i$ is the task period (minimum inter-arrival time), $c_i$ is the (worst-case) computation requirement of the task and $d_i$ is the task (relative) deadline. This model was introduced by Liu and Layland [1] and has since been widely adopted; see, for example, [2–7]. Note that it can be assumed without loss of generality that time is discrete, and that all task parameters are integers [13]. Although implicit deadline tasks (i.e. those in which $d_i = p_i$) are most commonly discussed in the literature (and employed in practice), no specific relationship between periods and deadlines is assumed here, in order to fully generalize the work. Note that periodic tasks may additionally be described by an additional parameter, an initial release time (or offset phasing) $r_i$. Finally, the system utilization U represents the fraction of time the processor will be occupied processing the jobs in the task set over its lifetime, and is defined as $U = \sum_i c_i / p_i$.
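The task tuple and the utilization formula translate directly into code. The following Python sketch is illustrative only: the `Task` type and the helper names are hypothetical, and exact rational arithmetic is an implementation choice made here to avoid rounding at the U = 1 boundary. The three-task set is the one used for the example schedule of Fig. 16.2.

```python
from collections import namedtuple
from fractions import Fraction

# (p, c, d): period, worst-case computation requirement, relative deadline.
Task = namedtuple("Task", ["p", "c", "d"])


def utilization(tasks):
    """System utilization U, the sum of c_i / p_i over all tasks."""
    return sum(Fraction(task.c, task.p) for task in tasks)


def has_implicit_deadlines(tasks):
    """True when every task has d_i equal to p_i."""
    return all(task.d == task.p for task in tasks)


example = [Task(4, 1, 4), Task(6, 2, 6), Task(12, 3, 12)]
u = utilization(example)
```

For this task set the utilization is 1/4 + 2/6 + 3/12 = 5/6, so the processor is busy for five sixths of the time in the long run.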

16.3.2 npEDF Algorithm Operation

The npEDF algorithm may be described, in simple terms, as follows:

1. When selecting a task for execution, the task with the earliest deadline is selected first (and then run to completion).


Fig. 16.2 Example schedule generated by npEDF

2. Ties between tasks with identical deadlines are broken by selecting the task with the lowest index.
3. Unless the processor is idle, scheduling decisions are only made at task boundaries.
4. When the scheduler is idle, the first task to be invoked is immediately executed (if multiple tasks are simultaneously invoked, the task with the earliest deadline is selected).

This simple (but deceptively effective) algorithm may be implemented using only a single hardware timer for periodic tasks. The algorithm clearly differs from the static table-driven approaches, in that the schedule is built on-line, and there is therefore no concept of a fixed time ‘frame’ or ‘tick’. An example schedule which is built by npEDF for the set of synchronous tasks [(4, 1, 4), (6, 2, 6), (12, 3, 12)] is shown in Fig. 16.2.
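The four rules can be exercised with a small discrete-time simulation. The Python sketch below is a hedged illustration, not the chapter's implementation: the function name `np_edf_schedule`, the (p, c, d) tuple layout, the synchronous release of all tasks at t = 0, and the assumption that every job runs for exactly its worst-case time are all choices made for this example.

```python
def np_edf_schedule(tasks, horizon):
    """Simulate non-idling npEDF for periodic tasks released together at t = 0.

    tasks: list of (p, c, d) tuples with integer parameters.
    Returns a list of (start, finish, task_index, absolute_deadline).
    """
    next_release = [0] * len(tasks)
    ready = []  # entries are (absolute_deadline, task_index)
    schedule = []
    t = 0
    while t < horizon:
        # Release every job whose activation time has been reached.
        for i, (p, c, d) in enumerate(tasks):
            while next_release[i] <= t:
                ready.append((next_release[i] + d, i))
                next_release[i] += p
        if not ready:
            t = min(next_release)  # processor idles until the next activation
            continue
        ready.sort()  # earliest deadline first, ties broken by lowest index
        deadline, i = ready.pop(0)
        start = t
        t += tasks[i][1]  # run to completion, no preemption
        schedule.append((start, t, i, deadline))
    return schedule


tasks = [(4, 1, 4), (6, 2, 6), (12, 3, 12)]
schedule = np_edf_schedule(tasks, horizon=12)
all_deadlines_met = all(finish <= deadline for _, finish, _, deadline in schedule)
```

In this run the second job of the first task is released at t = 4 while the third task is still executing; it has to wait until t = 6 and completes at t = 7, before its deadline at t = 8. The processor is busy for 10 of the 12 time units, consistent with U = 5/6.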

16.3.2.1 npEDF: Common Criticisms

As mentioned in the introduction, generally due to misconceptions (or misinterpretations) of its basic operation and use, npEDF is generally seen to be too problematic for use in real systems. The main criticisms that can be found in the literature are listed below:

1. npEDF is not an optimal non-preemptive scheduling algorithm;
2. npEDF is difficult to analyze, and no efficient schedulability tests exist;
3. npEDF is not ‘robust’ to changes in the task set parameters;
4. Timer rollover can lead to anomalies and deadline misses in an otherwise schedulable task set;
5. The use of npEDF leads to increased overheads (and power consumption) compared to other non-preemptive scheduling techniques.

Note that optimal in this sense refers to the ability of npEDF to build a valid schedule, if such a schedule exists. Additionally, robustness refers to the ability of a scheduling algorithm to tolerate run-time reductions in the execution requirement of one (or more) tasks (or, equivalently, increases in period) without deadline misses occurring in an otherwise schedulable task set. Please also note that apart from point 3, this list of criticisms is specific to npEDF, and therefore does not


include the so-called ‘long-task’ problem which is endemic to all non-preemptive schedulers. This specific problem arises when one or more tasks have a deadline that is less than the execution time of another task. In this situation, effective solutions are known to include code refactoring at the task level, employing state machines, or alternatively adopting hybrid designs [4, 8, 14]. Such solution techniques easily generalize to npEDF, and are not discussed in any further depth here.

16.4 The Case for npEDF If all of the criticisms given in the previous section are based in fact, then npEDF does not seem a wise choice for system implementation; in fact the contrary would be true. This section will examine each point in greater detail, to investigate if, in fact, each specific claim actually holds.

16.4.1 npEDF is not Optimal

As mentioned previously, optimal in this sense refers to the ability of a scheduling algorithm to build a valid schedule for an arbitrary set of feasible tasks, if such a schedule exists. Each (and every) proof that npEDF is sub-optimal relies on a counter-example of the form shown in Fig. 16.3 (taken from Liu [12]; a similar example appears in Buttazzo [5]). It can be seen that despite the existence of a feasible schedule, obtained via the use of a scheduler which inserts idle-time between t = 3 and t = 4 (indicated by the question marks in the figure), the schedule produced by npEDF misses a deadline at t = 12. Since the use of inserted idle-time can clearly have a beneficial effect with respect to meeting deadlines, a related question immediately arises: how complex is a scheduler that uses inserted idle-time, and will such a scheduler be of practical use for a real system?

The answer, unfortunately, is a resounding no. Two important results were formally shown by Howell and Venkatrao [15]. The first is that there cannot be an optimal on-line algorithm using inserted idle-time for the non-preemptive scheduling of sporadic tasks; only non-idling scheduling strategies can be optimal. The second is that an on-line scheduling strategy that makes use of inserted idle-time to schedule non-preemptive periodic tasks cannot be efficiently implemented unless P = NP. It can thus be seen that inserted idle-time is not beneficial when scheduling sporadic tasks, and if efficiency is taken into account, then attention must be restricted to non-idling strategies when scheduling periodic tasks. Efficiency in this sense refers to the amount of time taken by the scheduler to make scheduling decisions; only schedulers that take time proportional to some polynomial in the task set parameters can be considered efficient (a scheduler which takes 50 years to decide the optimal strategy for the next 10 ms is not much


Fig. 16.3 npEDF misses a deadline, yet a feasible schedule exists

practical use). What is known about the non-idling scheduling strategies? These include, for example, npEDF, TTC scheduling [4, 14] and non-preemptive Rate Monotonic (npRM) scheduling [16]. npEDF is known to be optimal among this class of algorithms for scheduling recurring tasks; results in this area were known as early as 1955 [17]. The proof was demonstrated in the real-time context by Jeffay et al. [6] for the implicit deadline case, and extended by George et al. [18] to the constrained deadline case. Thus, the overall claim status: npEDF is sub-optimal for periodic tasks if and only if P = NP, and is optimal for sporadic tasks regardless of the equivalence (or otherwise) of these complexity classes.

16.4.2 No Efficient Schedulability Tests Exist for npEDF

Consider again the example shown in Fig. 16.3, in which the npEDF algorithm misses a deadline. Why is the deadline missed? At t = 3, only J2 is active and, since the scheduler is non-idling, it immediately begins execution of this task. Subsequently at t = 4, J3 is released and has an earlier deadline, but it is blocked (due to non-preemption) until J2 has run to completion at t = 9. This is known as a ‘priority inversion’, as the scheduler cannot change its mind once committed. This is highlighted further in Fig. 16.4. Worst-case priority inversions under npEDF scheduling have been investigated in some detail. A relatively simple set of conditions for implicit deadline tasks was derived by Jeffay et al. [6], and was subsequently generalized by George et al. to the arbitrary deadline case [18]. They showed that a set of arbitrary-deadline periodic/sporadic tasks is schedulable under npEDF if and only if all deadlines are met over a specific analysis interval (of length L) following a synchronous arrival sequence of the tasks at t = 0, with worst-case blocking due to non-preemption occurring simultaneously. This situation is depicted in Fig. 16.5,


Fig. 16.4 Priority inversion due to non-preemption

Fig. 16.5 Worst-case priority inversion induced by task i arriving at t = -1

showing the task with the largest execution time beginning execution one time unit prior to the simultaneous arrival of the other tasks. These conditions can be formalized to obtain a schedulability test, which is captured by the following conditions:

$$U \le 1.0; \quad \forall t,\ 0 < t < L:\ h_b(t) \le t \qquad (16.2)$$

where

$$h_b(t) = \sum_{i=1}^{n} \max\left\{0, \left\lfloor \frac{t + p_i - d_i}{p_i} \right\rfloor\right\} c_i + \max_{d_i > t}\{c_i - 1\} \qquad (16.3)$$

and

$$L = \max\left\{ d_1, \ldots, d_n, \frac{\sum_{i=1}^{n} (p_i - d_i)\, U_i}{1 - U} \right\} \qquad (16.4)$$
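The three conditions can be turned into a short off-line test. The Python sketch below is a hedged reading of Eqs. 16.2–16.4 rather than a definitive implementation, and it rests on several assumptions that are not stated in the text: $U_i$ is taken to mean $c_i / p_i$; the maximum over an empty set in the blocking term is taken to be zero; and the inequality is evaluated only at absolute deadlines $t = k p_i + d_i$ inside the analysis interval. The last assumption matters, because for instants earlier than the first deadline the demand term is zero while the blocking term may already exceed t. All function names are hypothetical.

```python
from fractions import Fraction


def utilization(tasks):
    """U as the sum of c_i / p_i, using exact rational arithmetic."""
    return sum(Fraction(c, p) for p, c, d in tasks)


def h_b(tasks, t):
    """Processor demand up to t plus worst-case blocking (cf. Eq. 16.3)."""
    demand = sum(max(0, (t + p - d) // p) * c for p, c, d in tasks)
    blocking = max((c - 1 for p, c, d in tasks if d > t), default=0)
    return demand + blocking


def analysis_interval(tasks):
    """Length L of the interval to be checked (cf. Eq. 16.4); needs U < 1."""
    u = utilization(tasks)
    slack = sum((p - d) * Fraction(c, p) for p, c, d in tasks) / (1 - u)
    return max(max(d for p, c, d in tasks), slack)


def np_edf_schedulable(tasks):
    """Check Eq. 16.2 at every absolute deadline t with 0 < t < L."""
    u = utilization(tasks)
    if u > 1:
        return False
    if u == 1:
        raise ValueError("Eq. 16.4 is undefined when U equals 1")
    limit = analysis_interval(tasks)
    deadlines = set()
    for p, c, d in tasks:
        t = d
        while t < limit:
            deadlines.add(t)
            t += p
    return all(h_b(tasks, t) <= t for t in sorted(deadlines))


fig_16_2_tasks = [(4, 1, 4), (6, 2, 6), (12, 3, 12)]
long_task_set = [(4, 1, 4), (12, 5, 12)]
```

With these assumptions the three-task set from Fig. 16.2 passes the test, whereas the second set fails at t = 4: a job of length 5 that starts one time unit before the short task arrives blocks it for 4 time units, so the short task cannot complete within its deadline of 4. This is an instance of the long-task problem mentioned earlier.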

It should be noted that the time complexity of an algorithm to decide Eqs. 16.2–16.4 is pseudo-polynomial (and hence highly efficient) whenever U < 1.0. Other upper bounds on the length of L are derived in [18]. The non-preemptive scheduling problem, in this formulation, turns out to be only weakly coNP-Complete. When compared to feasibility tests for other non-preemptive scheduling disciplines, this is significantly better. For example, it is known that deciding if a


set of periodic processes can be scheduled by a cyclic executive or timeline scheduler is strongly NP-Hard [8, 9]; it is also known that deciding if a set of periodic processes can be scheduled by a TTC scheduler is strongly coNP-Hard (see footnote 2) [19]. Note that strong and weak complexity results have a precise technical meaning; specifically, amongst other things the former rules out the prospect of a pseudo-polynomial time algorithm unless P = NP, whereas the latter does not. Thus, although a very efficient algorithm may be formulated to exactly test for Eqs. 16.2–16.4, it is thought that no exact algorithm can ever be designed to efficiently test schedulability for these alternate scheduling policies. Overall claim status: npEDF admits an efficient feasibility test for periodic (sporadic) tasks that ensures even worst-case priority inversions do not lead to deadline misses.

16.4.3 npEDF is not Robust to Reductions in System Load

With respect to this complaint, Jane Liu presents some convincing evidence on p. 73 of her book Real-Time Systems [12], and cites the seminal paper by Graham [20] investigating timing anomalies. There are two principal problems here. First, the paper by Graham deals only with the multiprocessor case; specifically, it investigates the effects of reduced (aperiodic) task execution times on the makespan produced by the LPT heuristic scheduling technique. So do the examples on p. 73 of Liu’s book, although this is not made explicitly clear. With respect to single-processor scheduling, these examples simply do not apply. Second, the only single-processor timing anomaly referred to in the Liu text is reproduced in Fig. 16.6; at first glance, it seems that a run-time reduction in the execution requirement of job C1 does, indeed, lead to a deadline miss of J3. However, upon closer inspection, this example can be seen to be almost identical to the example given in Fig. 16.3, with the execution of J1 between t = 3 and t = 4 effectively serving the same purpose as the inserted idle-time in Fig. 16.3. In order for this example to hold up, it must logically follow that the task set is provably schedulable when the tasks have the nominal parameters given by A); applying Eqs. 16.2–16.4 to these tasks, it can be quickly determined that the task set is not deemed to be schedulable, since the formulation of Jeffay’s feasibility test takes worst-case priority inversion into account. This example is misleading with respect to npEDF: since the task set simply fails the basic feasibility test, Liu’s argument of ‘an otherwise schedulable task set’ becomes a non-starter. This again highlights the fact that misconceptions regarding robustness and priority inversions have principally arisen from one simple fact; as shown in the previous section, the worst-case behavior of a task set, that is, its critical

Footnote 2: In fact, this situation is known to be considerably worse than this. The problem is actually known to be NP^NP-Complete [19]. Under the assumption that P ≠ NP, this means that the feasibility test requires an exponential number of calls to a decision procedure which is itself strongly coNP-Complete.

192

M. Short

Fig. 16.6 Evidence for a lack of npEDF robustness?

Fig. 16.7 Converting from an absolute (left) to a modular (right) representation of time

instants—under non-preemptive scheduling is not the same as under preemptive scheduling. Overall claim status: If appropriate (off-line) analysis is performed to confirm the schedulability of a task set, this task set will remain schedulable under npEDF even under conditions of reduced system load.

16.4.4 Timer Rollover Can Lead to Deadline Misses

This complaint can in fact be shown to hold, but the problem is easily solved. The assumption that time is represented as an integer (and in embedded systems, normally with a fixed number of bits, e.g. 16) eventually leads to timer rollover problems; deadlines will naturally ‘wrap around’ due to the modular representation of time. Since the normal laws of arithmetic no longer hold, it cannot be guaranteed that $d_i \bmod 2^b < d_j \bmod 2^b$ when $d_i < d_j$ and time is represented by b-bit unsigned integers. There are several efficient techniques that may be used to overcome this problem; perhaps the most efficient is as follows. Assuming that the inequality $p_m < 2^b/2$ holds over a given task set, i.e. the maximum period is less than half the linear lifetime of the underlying timer, the rollover problem may be efficiently overcome by using Carlini and Buttazzo’s Implicit Circular Timer Overflow Handler (ICTOH) algorithm [21]. The algorithm has a very simple code implementation, and is shown as C code in Fig. 16.7.


Fig. 16.8 Density of scheduling events for both TTC and npEDF scheduling

The algorithm’s operation exploits the fact that the modular distance between any two events (e.g. deadlines or activation times) x and y, encoded by b-bit unsigned integers, may be determined by performing a subtraction modulo $2^b$ between x and y, with the result interpreted as a signed integer. Overall claim status: rollover is easily handled by employing algorithms such as ICTOH.
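The modular-distance idea described above can be illustrated with a minimal Python sketch. This is not the C code of Fig. 16.7 nor Carlini and Buttazzo’s implementation; the function names `modular_distance` and `earlier_deadline`, and the 16-bit default, are assumptions chosen purely for illustration.

```python
def modular_distance(x: int, y: int, bits: int = 16) -> int:
    """Signed distance from y to x, both stored as b-bit unsigned timer values."""
    modulus = 1 << bits
    diff = (x - y) % modulus      # subtraction modulo 2^b
    if diff >= modulus // 2:      # reinterpret the result as a signed integer
        diff -= modulus
    return diff


def earlier_deadline(d_i: int, d_j: int, bits: int = 16) -> bool:
    """True if d_i precedes d_j, assuming they are less than 2^b / 2 ticks apart."""
    return modular_distance(d_i, d_j, bits) < 0


# Two deadlines 11 ticks apart that straddle a 16-bit rollover.
before = 65530                    # absolute time 65530
after = 65541 % (1 << 16)         # absolute time 65541, stored as 5
naive_order_ok = before < after                      # plain comparison gets it wrong
modular_order_ok = earlier_deadline(before, after)   # modular comparison is correct
```

The comparison is only meaningful while the two events are less than half the timer range apart, which is the role played by the condition on the maximum period stated above.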

16.4.5 npEDF Use Leads to Large Scheduling Overheads

In order to shed more light on this issue, let us consider the required number of ‘scheduling events’ over the hyperperiod (major cycle) of a given periodic task set, and also the complexity of each such event (the required CPU iterations, as a function of the task parameters). Specifically, let us consider these scheduling events as required for task sets under both npEDF and TTC scheduling. TTC scheduling is considered as the baseline case in this respect, as it has previously been argued that a TTC scheduler provides a software architecture with minimal overheads and resource requirements [4, 7, 14]. With npEDF, one scheduling event is required for each and every task execution, and the scheduler enters idle mode when all pending tasks have been executed. It can be woken by an interrupt set to match the earliest time at which a new task will be invoked. The TTC algorithm is designed to perform a scheduling event at regular intervals, in response to periodic timer interrupts; the period of these interrupts is normally set to be the greatest common divisor of the task periods [4, 14]. Let the major cycle h of a set τ of synchronous tasks be given by $h = \operatorname{lcm}(p_1, p_2, \ldots, p_n)$. The numbers of scheduling events occurring in h for the TTC scheduler, $SE_{TTC}$, and for the npEDF scheduler, $SE_{EDF}$, are given by:

\[
SE_{TTC} = \frac{\operatorname{lcm}(p_1, p_2, \ldots, p_n)}{\gcd(p_1, p_2, \ldots, p_n)}, \qquad
SE_{EDF} = \sum_{i \in \tau} \frac{\operatorname{lcm}(p_1, p_2, \ldots, p_n)}{p_i}
\tag{16.5}
\]
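As a worked instance of Eq. 16.5, the following Python sketch evaluates both event counts for two tasks with periods 90 and 100, the values of the example task set discussed next. The helper name `scheduling_events` and the reading of each pair as (period, execution time) are assumptions made for illustration.

```python
from functools import reduce
from math import gcd


def lcm(a: int, b: int) -> int:
    """Least common multiple of two positive integers."""
    return a * b // gcd(a, b)


def scheduling_events(periods):
    """Return (h, SE_TTC, SE_EDF) for a list of task periods, following Eq. 16.5."""
    h = reduce(lcm, periods)                  # major cycle (hyperperiod)
    tick = reduce(gcd, periods)               # TTC timer interrupt period
    se_ttc = h // tick                        # one event per timer tick
    se_edf = sum(h // p for p in periods)     # one event per task execution
    return h, se_ttc, se_edf


# Assumed interpretation: each tuple is (period, execution time).
tasks = [(90, 5), (100, 5)]
h, se_ttc, se_edf = scheduling_events([p for p, _ in tasks])
# h is 900; the TTC scheduler runs 90 events, npEDF only 10 + 9 = 19.
```

For this pair of periods the greatest common divisor is small relative to the periods themselves, which is the situation in which the gap between the two counts is widest.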

Clearly $SE_{EDF} \le SE_{TTC}$ in almost all cases, and an example to highlight this is shown for the task set τ = [(90, 5), (100, 5)] in Fig. 16.8, where scheduling events are indicated by the presence of up-arrows on the timeline. Also of interest are the time complexities of each scheduling event. Given the design of the TTC scheduler, it is clear from its implementation (see, for example, [4, 14]) that its complexity is O(n). Task management in the npEDF scheduler significantly improves upon this situation; it is known that the algorithm can be implemented with complexity O(log n) or better, in some cases O(1) [22]. To further illustrate this final point, Fig. 16.9 shows a comparison of the overheads incurred per scheduling event as the number of tasks was increased on a 72-MHz ARM7-TDMI microcontroller. Overhead execution times were extracted using the technique described in [22]. This graph clearly shows the advantages of the npEDF technique, and for n > 8 the overheads become increasingly smaller. Overall claim status: With an appropriate implementation, the density of npEDF scheduling events is significantly better than competing methods; the CPU overheads incurred at each such event are also significantly lower.

Fig. 16.9 CPU overheads vs. number of tasks
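One common way to obtain O(log n) task management is to keep the pending jobs in a binary heap keyed by absolute deadline. The sketch below uses Python’s `heapq` for this; it is an illustrative assumption and not the technique of [22]. The class name `ReadyQueue` and its methods are hypothetical, and deadlines are plain integers, so timer rollover is ignored here.

```python
import heapq


class ReadyQueue:
    """Pending jobs ordered by absolute deadline (earliest first)."""

    def __init__(self):
        self._heap = []
        self._count = 0   # tie-breaker: equal deadlines leave in release order

    def release(self, deadline: int, name: str) -> None:
        """Add a newly released job; O(log n)."""
        heapq.heappush(self._heap, (deadline, self._count, name))
        self._count += 1

    def dispatch(self):
        """Remove and return (name, deadline) of the earliest-deadline job; O(log n).

        Returns None when no job is pending, i.e. the scheduler may idle.
        Under non-preemptive EDF the returned job then runs to completion.
        """
        if not self._heap:
            return None
        deadline, _, name = heapq.heappop(self._heap)
        return name, deadline
```

A scheduling event then amounts to one `dispatch` call, in contrast to a linear scan over all n tasks.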

16.5 Conclusions

This chapter has considered the non-preemptive version of the Earliest Deadline First algorithm, and has investigated the supposed problems that have been attributed to this form of scheduling technique. Examples and analysis have been given to show that these problems are either baseless or trivially solved, and that in most cases npEDF outperforms many other non-preemptive software architectures. As such, it is the conclusion of the current chapter that npEDF should be considered as one of the primary algorithms for implementing resource-constrained real-time and embedded systems. A preliminary version of the work described in this chapter was presented at the World Congress on Engineering, July 2010 [23].

References

1. Liu CL, Layland JW (1973) Scheduling algorithms for multiprogramming in a hard real-time environment. J ACM 20(1):46–61
2. Coffman E Jr (1976) Introduction to deterministic scheduling theory. In: Computer and job-shop scheduling theory. Wiley, New York
3. Dertouzos ML (1974) Control robotics: the procedural control of physical processes. Inf Process 74
4. Pont M (2001) Patterns for time-triggered embedded systems. ACM Press/Addison-Wesley Education, New York
5. Buttazzo GC (2005) Hard real-time computing systems: predictable scheduling algorithms and applications. Springer, New York
6. Jeffay K, Stanat D, Martel C (1991) On non-preemptive scheduling of periodic and sporadic tasks. In: Proceedings of the IEEE Real-Time Systems Symposium
7. Short M, Pont M, Fang J (2008) Exploring the impact of pre-emption on dependability in time-triggered embedded systems: a pilot study. In: Proceedings of the 20th Euromicro Conference on Real-Time Systems (ECRTS 2008), Prague, Czech Republic, pp 83–91
8. Garey MR, Johnson DS (1979) Computers and intractability: a guide to the theory of NP-completeness. W.H. Freeman & Co Ltd, New York
9. Burns A, Hayes N, Richardson M (1994) Generating feasible cyclic schedules. Control Eng Pract 3(2):151–162
10. Baker TP, Shaw A (1989) The cyclic executive model and Ada. Real-Time Syst 1(1):7–25
11. Locke CD (1992) Software architecture for hard real-time applications, cyclic executives vs. fixed priority executives. Real-Time Syst 4(1):37–52
12. Liu JWS (2000) Real-time systems. Prentice-Hall, New Jersey
13. Baruah S, Rosier L, Howell R (1991) Algorithms and complexity concerning the preemptive scheduling of periodic, real-time tasks on one processor. Real-Time Syst 2(4):301–324
14. Gendy AK, Pont MJ (2008) Automatically configuring time-triggered schedulers for use with resource-constrained, single-processor embedded systems. IEEE Trans Ind Inform 4(1):37–45
15. Howell R, Venkatrao M (1995) On non-preemptive scheduling of recurring tasks using inserted idle times. Inf Comput 117:50–62
16. Park M (2007) Non-preemptive fixed priority scheduling of hard real-time periodic tasks. Lect Notes Comput Sci 4990:881–888
17. Jackson JR (1955) Scheduling a production line to minimize maximum tardiness. Research report 43, Management Science Research Project, University of California, Los Angeles, USA
18. George L, Rivierre N, Spuri M (1996) Preemptive and non-preemptive real-time uniprocessor scheduling. Research report RR-2966, INRIA, Le Chesnay Cedex, France
19. Short M (2009) Some complexity results concerning the non-preemptive ‘thrift’ cyclic scheduler. In: Proceedings of the 6th International Conference on Informatics in Control, Robotics and Automation (ICINCO 2009), Milan, Italy, July 2009, pp 347–350
20. Graham RL (1969) Bounds on multiprocessing timing anomalies. SIAM J Appl Math 17:416–429
21. Carlini A, Buttazzo GC (2003) An efficient time representation for real-time embedded systems. In: Proceedings of the ACM Symposium on Applied Computing (SAC 2003), Florida, USA, March 2003, pp 705–712
22. Short M (2010) Improved task management techniques for enforcing EDF scheduling on recurring task sets. In: Proceedings of the 16th IEEE Real-Time and Embedded Technology and Applications Symposium (RTAS 2010), Stockholm, Sweden, April 2010, pp 56–65
23. Short M (2010) The case for non-preemptive, deadline-driven scheduling in real-time embedded systems. In: Lecture notes in engineering and computer science: Proceedings of the World Congress on Engineering 2010 (WCE 2010), vol 1. London, UK, 30 June–2 July 2010, pp 399–404

Chapter 17

Towards Linking Islands of Information Within Construction Projects Utilizing RF Technologies

Javad Majrouhi Sardroud and Mukesh Limbachiy

Abstract Modern construction management requires real-time and accurate information to be shared among all the parties involved so that projects can be planned and executed efficiently and effectively. Research projects conducted during the last decade have concluded that information management is a critical factor in construction project performance and plays an essential role in managing construction where projects need to be completed within a defined budget and deadline. Recently, wireless sensor technologies have matured and become both technically and economically feasible and viable. This research investigates a framework for integrating the latest innovations in Radio Frequency (RF) based information management systems to automate the task of collecting and sharing detailed and accurate information in an effective way throughout actual construction projects. The solution presented here is intended to extend the use of a cost-effective and easy-to-implement system (Radio Frequency Identification (RFID), Global Positioning System (GPS), and Global System for Mobile Communications (GSM)) to facilitate low-cost and network-free solutions for obtaining real-time information and sharing it among the involved participants of ongoing construction projects, such as owner, consultant, and contractor.

J. M. Sardroud (✉), Faculty of Engineering, Central Tehran Branch, Islamic Azad University, Tehran, Iran

M. Limbachiy, School of Civil and Construction Engineering, Kingston University London, Kingston upon Thames, London, KT1 2EE, UK

S. I. Ao and L. Gelman (eds.), Electrical Engineering and Applied Computing, Lecture Notes in Electrical Engineering, 90, DOI: 10.1007/978-94-007-1192-1_17, © Springer Science+Business Media B.V. 2011


17.1 Introduction

Construction is identified internationally as one of the information-intensive industries, and one that is subject to an open environment and harsh conditions [1, 2]. Due to the complex, unprepared, and uncontrolled nature of the construction site, automated advanced tracking and data storage technologies are needed for efficient information management; the construction industry has also benefited greatly from technology that raises the speed of information flow, enhances the efficiency and effectiveness of information communication, and reduces the cost of information transfer [3]. Missing and delayed information access constitutes 50–80% of the problems in construction. One of the major sources of information is the data collected on construction sites. Even though the collection and timely delivery of detailed, accurate, and sufficient information is vital for effective construction management, current on-site information management still relies on manual methods, using paper and pencil, in all parts of the construction phase [4]. Previous observations on construction sites indicate that 30–50% of the field supervisory personnel’s time is spent on recording and analyzing field data [5] and that 2% of the work on construction sites is devoted to manual tracking and recording of progress data [6]. Data collected using manual methods are not reliable or complete, owing to the reluctance of workers to monitor and record the flow of large quantities of elements. Such data are usually transferred and stored in paper-based format, which is difficult to search and access and makes processing data into useful information expensive and unreliable. Thus, some information items end up being unavailable to the parties who need timely access to them to make effective decisions [7].
Effective and immediate access to information minimizes the time and labour used for retrieving information related to each part of construction and reduces the occurrence of ineffective decisions that are made in the absence of information [8]. The process of capturing quantity-of-work data at a construction site needs to be improved in terms of accuracy and completeness to eliminate unnecessary communication loops and secondary tasks caused by missing or inaccurate data. All of this suggests the need for a fully automatic data collection technology that captures status information throughout construction and integrates it into a database automatically. The ubiquitous system developed in this research has the potential to extend the boundary of information systems from the actual work sites to the site offices and to ensure real-time data flow among all participants of construction projects. This research investigates fully automated data collection using integrated applications of Radio Frequency Identification (RFID), Global Positioning System (GPS) and Global System for Mobile Communication (GSM) technologies in the construction industry, with a focus on the real-time exchange of information between on-site and off-site locations. The system provides a clear path for obtaining real-time information and sharing it via the Internet among the involved participants of the construction phase, such as owner, consultant, and contractor. The solution presented here is intended to extend the use of the current technologies RFID, GPS, and GSM to facilitate extremely low-cost and network-free solutions that form the backbone of an information management system for practical communication and control among construction participants. The remainder of this paper first reviews previous research on applications of wireless technologies in construction, followed by an assessment of the enabling technologies utilized in this research. It then presents the architecture of the integrated system for collecting and sharing real-time information in all parts of the construction phase. Finally, conclusions are given.

17.2 Prior Research Efforts

Many research projects have focused on the application of wireless technologies in the construction sector [9]. These technologies can be used for tracking, locating, and identifying materials, vehicles and equipment, leading to important changes in managerial tasks in the construction industry [10]. Recent research projects have examined the potential of wireless technologies to improve the process of capturing data in the construction sector [11–13]; some of these are discussed here. Jaselskis et al. summarized RFID technology and surveyed its potential applications in the construction industry, including concrete processing and handling, cost coding of labour and equipment, and material control [14]. Chen et al. conducted an experiment in which bar-code techniques were used to facilitate effective management of construction materials and so reduce construction waste [15]. Jaselskis and El-Misalami implemented RFID to receive and track pipe spools in the industrial construction process. Their pilot test demonstrated that RFID could increase operational efficiency by saving time and cost in material receiving and tracking [16]. Oloufa et al. examined the use of differential GPS technology on construction sites to avoid equipment collisions [17]. Jang and Skibniewski developed an Automated Material Tracking system based on ZigBee localization technology with two different types of query and response pulses [18]. Song et al. developed a system that can identify the logistics flow and location of construction materials with better performance by using wireless sensor networks such as ZigBee technologies [19]. Majrouhi Sardroud and Limbachiya investigated the use of RFID technology in construction information delivery and management [4].
In some research efforts, authors have developed RFID-based methods to automate the task of tracking and locating construction materials and components in lay-down yards and under shipping portals [20–29], and to improve the efficiency of tracking tools and the movement of construction equipment and vehicles on and off construction sites [30–34]. Although the aforementioned research has proven the value and potential of wireless technologies, the reality is that cost-effective, scalable, and easy-to-implement information management systems are rarely used effectively on actual construction projects. This research created a framework for integrating the latest innovations in Automated Data Collection (ADC) technologies, such as RFID, GPS, and GSM, that provides a clear path to automating the collection and sharing of detailed, accurate, and sufficient information throughout the construction phase with minimal or no human effort.

17.3 Technology Description

Recently, wireless sensor technologies have matured and become both technically and economically feasible and viable to support information delivery and management for the construction industry. Advanced tracking and data storage technologies such as RFID, GPS, and GSM provide automated data collection during the construction phases and allow all participants to share data accurately, completely, and almost instantly. In recent years, RFID has attracted attention as an alternative to the bar code and has been successfully applied in manufacturing, the distribution industry, supply chains, agriculture, transportation, and healthcare [35, 36]. RFID is a method of remotely storing and retrieving data by utilizing radio frequency in identifying, tracking, and detecting various objects [37]. An early, if not the first, work exploring RFID is the landmark paper by Harry Stockman, ‘‘Communication by Means of Reflected Power’’ [38]. An RFID system consists of tags (transponders) with an antenna, a reader (transceiver) with an antenna, and a host terminal. The RFID reader acts as a transmitter/receiver and transmits an electromagnetic field that ‘‘wakes up’’ the tag and provides the power required for the tag to operate [3]. A typical RFID system is shown in Fig. 17.1. An RFID tag is a portable memory device located on a chip that is encapsulated in a protective shell; it can be attached to any object and stores dynamic information about that object. Tags consist of a small integrated circuit chip coupled with an antenna to enable them to receive and respond to radio frequency queries from a reader. Tags can be categorized as read-only (RO), write once, read many (WORM), and read-write (RW), and the capacity of their built-in memories varies from a few bits to thousands of bits. RFID tags can be classified into active tags (battery powered) and passive tags, which are powered solely by the magnetic field emanating from the reader and hence have an unlimited lifetime. Reading and writing ranges depend on the operating frequency (low, high, ultra high, and microwave). Low frequency systems generally operate at 124, 125 or 135 kHz. High frequency systems operate at 13.56 MHz, and ultra high frequency (UHF) systems use a band anywhere from 400 to 960 MHz [39]. Tags operating at UHF typically have longer reading ranges than tags operating at other frequencies. Similarly, active tags typically have longer reading ranges than passive tags. Tags also vary by the amount of information they can hold, life expectancy, recyclability, attachment method, usability, and cost. The communication distance between RFID tags and readers may decrease significantly due to interference from steel objects and moisture in the vicinity, both of which are commonplace on a construction site. Active tags have an internal battery and therefore have a shorter lifetime of approximately three to ten years [16]. The reader, combined with an external antenna, reads/writes data from/to a tag via radio frequency and transfers data to a host computer. The reader can be configured either as a handheld or a fixed-mount device [40]. The host and software system is an all-encompassing term for the hardware and software components that are separate from the RFID hardware (i.e., reader and tag); the system is composed of the following four main components: edge interface/system, middleware, enterprise back-end interface, and enterprise back end [14].

Fig. 17.1 A typical RFID system (components shown: RFID tag, RFID reader, host terminal)
RFID tags are more durable and better suited to a construction site environment than barcodes, which are easily peeled off and may become illegible when dirty. RFID tags are not damaged as easily and do not require line-of-sight for reading and writing; they can also be read in direct sunlight, survive harsh conditions, are reusable, and can be read remotely [4]. RFID tags can be manufactured in many shapes to suit the assets to which they are attached [41]. GPS is a Global Positioning System based on satellite technology. The activities on GPS were initiated by the US Department of Defence (DOD) in the early 1970s under the term Navigation Satellite Timing and Ranging System (NAVSTAR). Glonass, Galileo, and BeiDou are the Russian, European Union, and Chinese satellite navigation systems, respectively [42, 43]. GPS consists of nominally 24 satellites that provide the ranging signals and data messages to the user equipment [44]. To calculate a location, readings from at least four satellites are necessary, because there are four parameters to calculate: three location variables and the receiver’s time [45]. To obtain metre-level or sub-metre accuracy in positioning data (i.e. longitude, latitude, and altitude), a single GPS receiver is not sufficient; instead, a pair of receivers perform measurements with common satellites and operate in a differential mode. Differential GPS (DGPS) provides sufficient accuracy for most outdoor tracking applications. In DGPS two receivers are used. One receiver measures the coordinates of a stationary point, called the base, whose position is precisely known in the reference geodetic system used by GPS. The 3-D deviation between the measured and actual position of the base, which is roughly equal to the measurement error at a second receiver at an unknown point (called the ‘‘rover’’), is used to correct the position computed by the latter [46]. GSM is a worldwide standard for cellular communications. The idea of cell-based mobile radio systems appeared at Bell Laboratories in the early 1970s. In 1982 the Conference of European Posts and Telecommunications formed the Groupe Spécial Mobile (GSM) to develop a pan-European mobile cellular radio system (the acronym later became Global System for Mobile communications). One of the currently available technologies for mobile data transfer is the General Packet Radio Service (GPRS). GPRS is a packet-switched, ‘‘always on’’ technology which allows data to be sent and received across a mobile telephone network almost instantly, so immediacy is one of the advantages of GPRS [47].
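The differential correction described above for DGPS can be shown as a short worked example. The Python sketch below is a deliberate simplification: it assumes positions are expressed in a local Cartesian frame in metres (east, north, up) and applies the base station’s position error directly to the rover. The function name `dgps_correct` and all coordinate values are hypothetical.

```python
def dgps_correct(base_known, base_measured, rover_measured):
    """Correct a rover position using the error observed at a base station.

    All arguments are (east, north, up) tuples in metres in the same local frame.
    """
    # 3-D deviation between the measured and the actual position of the base
    error = tuple(m - k for m, k in zip(base_measured, base_known))
    # The rover is assumed to suffer roughly the same error, so subtract it
    return tuple(r - e for r, e in zip(rover_measured, error))


# Hypothetical values: the base reads 3 m east, 2 m south and 2 m high.
base_known = (1000, 2000, 50)
base_measured = (1003, 1998, 52)
rover_measured = (1503, 2498, 57)
rover_corrected = dgps_correct(base_known, base_measured, rover_measured)
```

The correction is only as good as the assumption that both receivers see the same error, which is why the text refers to measurements made with common satellites.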

17.4 Architecture of the Proposed System

Fig. 17.2 A schematic model of U-Box (blocks shown: micro controller, GSM, GPS, RFID reader, memory, battery, motion sensor, external sensors, LED indicators; RFID, GSM, and GPS antennas)

The RFID-based ubiquitous system (U-Box) utilized in this research is a combination of GPS, RFID and GSM and, as such, takes advantage of the respective strengths of each. The system can be divided into two major parts: a mobile system and a central station. The mobile system mainly consists of three types of hardware components, namely: (i) GPS technology; (ii) RFID technology, where passive High Frequency (HF) and Ultra High Frequency (UHF) band RFID tags are used for identifying the object/user and obtaining related information by means of an RFID reader plugged into the mobile system; and (iii) GSM communication technology, where the information (ID, specific information, and date) retrieved from the RFID readers and GPS is transferred to the server using GPRS or SMS. The central station consists of two servers: the application server (portal system) and the database server (project database). In this approach, data collection is done continuously and autonomously; RFID is therefore used to solve the information collection problem, and the portal with GSM technology is used to solve the information communication problem in the construction industry. A schematic model of U-Box is shown in Fig. 17.2.


As can be seen in the block diagram, the device has a rechargeable internal battery and a motion sensor. The micro controller checks the power source; if external power is still connected, it sends a command to the controller module to recharge the battery. The device also has its own internal memory to store information (latitude, longitude, date and time, etc.), can save information when it loses the GSM network, and sends the saved information immediately after registering on the network again. Users can attach external sensors to the U-Box; the micro controller reads the sensor information, stores it in internal memory, and then sends it to the data centre via a defined link. In the identification segment, the RFID technology selected for any moving probes is active RFID, because active RFID tags, with their independent power supply, allow greater communication ranges, higher carrier frequencies, greater data transmission rates, better noise immunity, and larger data storage capacity. In the positioning segment, the GPS unit must be durable enough to function in open-air conditions. The GPS receiver had a nominal accuracy of 5 m with the Wide Area Augmentation System (WAAS). In the data transmission segment, GPRS and SMS have been selected to support data transmission between the U-Box and the central office. GPRS, connected to the GSM network via a SIM card for data transfer, enables several new applications, such as Multimedia Messaging, which have not previously been available over GSM networks due to the limitation on SMS message length (160 characters). In this approach, on-site data collection begins with RFID tags, each of which contains a unique ID number and carries data about its host, such as item-specific information, in its internal memory. A tag can be placed on any object/user, such as materials or workers.
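The behaviour described above, where readings are kept in internal memory while the GSM network is unavailable and sent once the device registers again, is a store-and-forward pattern. The following Python sketch illustrates it under stated assumptions: the class `StoreAndForward`, its methods, and the record strings are hypothetical and do not represent the U-Box firmware.

```python
from collections import deque


class StoreAndForward:
    """Buffer records locally and deliver them in order when the link is up."""

    def __init__(self, send):
        self._send = send            # callable that transmits one record
        self._pending = deque()      # stands in for the device's internal memory

    def submit(self, record, network_up: bool) -> None:
        """Queue a new record and try to deliver if the network is registered."""
        self._pending.append(record)
        if network_up:
            self.flush()

    def flush(self) -> int:
        """Send all buffered records, oldest first; return how many were sent."""
        sent = 0
        while self._pending:
            self._send(self._pending[0])   # remove only after a successful send
            self._pending.popleft()
            sent += 1
        return sent

    def pending_count(self) -> int:
        return len(self._pending)
```

Removing a record only after `send` returns means that a transmission failure raised as an exception leaves the record in the buffer for the next attempt.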
During the construction process, whenever an object is moved, the information on its RFID tag is captured and deciphered by the RFID reader connected to the mobile system, and the micro controller obtains a position from the GPS (which is part of the U-Box) and stores the location of the object/user. The ID and location information of the object/user is then sent to a database using GSM technology. In this approach, the tags are used only for identification, and all of the related information is uploaded and stored in one or more databases indexed by the same unique object ID. In an alternative mechanism, information can be stored directly on the RFID tags, with no data stored on the server. Information updates and announcements are sent synchronously via the portal, and the system increases the accuracy and speed of data entry by providing owners, consultants, and contractors with real-time information on any object/user. The application server defines various applications for collecting, sharing, and managing information. Any moving probes, such as materials handling equipment (top-slewing and bottom-slewing tower cranes, truck-mounted mobile cranes, and crawlers), hoists, internal and external delivery vehicles, the gates, and some key workers, should be equipped with the U-Box. This intelligent system could be programmed to send back information via SMS when the RFID reader or user-defined sensors connected to the system receive new data, for example when a component is loaded onto a truck or when the sensors detect data. Collected data will be used on the application side through a web-based portal system for information sharing among all participants. Electronic exchange of collected information leads to a reduction of errors and improved efficiency of the operational processes. The portal system provides a single, integrated database, both within an organization and among the organizations and their major partners. With the portal system and its coupled tools, managers and workers of each participant can conduct valuable monitoring and controlling activities throughout the construction project. For instance, information transmitted back to the engineering office for analysis and records enables the generation of productivity reports, and this up-to-date information about construction enables effective management of the project. One of the challenges of designing an effective construction information management system is designing an effective construction tagging system. Each RFID tag is equipped with a unique electronic identity code, which is usually the basis of reports that contain tracking information for a particular user/object. In choosing the right RFID tag for any application, there are a number of considerations, including frequency range, memory size, range performance, form factor, environmental conditions, and standards compliance. To minimize the performance reduction of the selected technology in contact with metal and concrete, RFID tags need to be encapsulated or insulated. Extremely heavy foliage or underground places such as tunnels can cause the signal to fade to the extent that it can no longer be received by the GPS or GSM antenna. When this happens, the receiver no longer knows its location and, in the case of an intelligent system application, the vehicle is technically lost and the central office does not receive information from the system.
In this case, to locate vehicles inside GPS blind areas, the intelligent system uses the RFID reader to record the IDs of tags passed along the way through the tunnel (each tag ID denotes a unique location). The device stores this information in internal memory as the current position, and the system sends the unsent data to the central office when the network is re-established. In this research, a geo-referenced map of the construction job site should be created once; it is then used to identify the locations of objects/users by comparing the coordinates received from the GPS with those in the geo-referenced map.
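The two lookups described above, falling back to a tag-defined location when GPS is unavailable and matching a coordinate against the geo-referenced site map, might be sketched as follows in Python. This is an illustration under stated assumptions: the function names, the tag ID, the zone name, the coordinates, and the representation of map zones as latitude/longitude bounding boxes are all hypothetical.

```python
def resolve_position(gps_fix, last_tag_id, tag_locations):
    """Return (position, source), preferring GPS and falling back to the last tag read."""
    if gps_fix is not None:
        return gps_fix, "gps"
    if last_tag_id in tag_locations:
        return tag_locations[last_tag_id], "rfid"   # each tag ID denotes one location
    return None, "unknown"


def zone_of(position, zones):
    """Name of the first map zone whose bounding box contains the position, else None.

    Each zone is (min_lat, min_lon, max_lat, max_lon).
    """
    lat, lon = position
    for name, (min_lat, min_lon, max_lat, max_lon) in zones.items():
        if min_lat <= lat <= max_lat and min_lon <= lon <= max_lon:
            return name
    return None


# Hypothetical site data
tag_locations = {"TUNNEL-01": (51.05, -0.15)}
zones = {"storage yard": (51.0, -0.2, 51.1, -0.1)}

position, source = resolve_position(None, "TUNNEL-01", tag_locations)
current_zone = zone_of(position, zones)
```

A real site map would more likely use polygons than bounding boxes; rectangles are used here only to keep the containment test short.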

17.5 Conclusions Proposed system is an application framework of RFID-based automated data collection technologies which focuses on the real-time collection and exchange of information among the all participants of construction project, construction site and off-site office. This system can provide low-cost, timely, and faster information flow with greater accuracy by using RFID technology, GPS, GSM, and a portal system. In this research data collection is done continuously and autonomously, therefore, the combination of selected Radio Frequencies (RF) based information and communication technologies as a powerful portable data collection tool enables collecting, storing, sharing, and reusing field data accurately, completely, and almost instantly. In this manner up-to-date information


regarding all parts of the construction phase is available, which permits real-time control and enables corrective actions to be taken. The system enables collected information to be shared among the involved participants of the construction phase via the Internet, which leads to important changes in construction project control and management. The proposed system has numerous advantages. It is automatic, thus reducing labour costs and eliminating the human error associated with data collection during the processes of construction. It can dramatically improve construction management activities, which also helps to keep cost and time under control in the construction phase. The authors believe that, in practice, the proposed pervasive system can deliver a complete return on investment within a short period by reducing operational costs and increasing workforce productivity.


Chapter 18

A Case Study Analysis of an E-Business Security Negotiations Support Tool

Jason R. C. Nurse and Jane E. Sinclair

Abstract Active collaboration is undoubtedly one of the most important aspects within e-business. In addition to companies collaborating on ways to increase productivity and cut costs, there is a growing need for in-depth discussion and negotiations on their individual and collective security. This paper extends previous work on a tool aimed at supporting the cross-enterprise security negotiations process. Specifically, our goal in this article is to briefly present a case study analysis and evaluation of the usage of the tool. This provides further real-world insight into the practicality of the tool and the solution model which it embodies.

J. R. C. Nurse (✉), J. E. Sinclair
Department of Computer Science, Warwick University, Coventry, CV4 7AL, UK
e-mail: [email protected]
J. E. Sinclair
e-mail: [email protected]

S. I. Ao and L. Gelman (eds.), Electrical Engineering and Applied Computing, Lecture Notes in Electrical Engineering, 90, DOI: 10.1007/978-94-007-1192-1_18, © Springer Science+Business Media B.V. 2011

18.1 Introduction

E-business has matured into one of the most cost-efficient and streamlined ways of conducting business. As the use of this new business paradigm thrives, however, ensuring adequate levels of security for these service offerings emerges as a critical goal. The need for security is driven by an increasing regulatory and standards requirements base (e.g. EU Data Protection Act and US Sarbanes–Oxley Act) and escalating security threats worldwide (as indicated in [1]). Similar to the business-level collaborations necessary to facilitate these interactions, there also needs to be a number of discussions and negotiations on security. A key problem during collaborations, however, is the complex discussion task that often ensues, as companies have different security postures, have a range of disparate security needs, may subscribe to dissimilar laws/regulations, have different skill sets/experience levels and so on. Owing to these and other challenges, [2] aptly labels the related process as ‘security mayhem’.

With appreciation of the collaboration difficulties highlighted above, particularly in terms of security approaches in Web services-based interactions, in previous work we have presented BOF4WSS, a Business-Oriented Framework for enhancing Web Services Security for e-business [3, 4]. The framework’s novelty stemmed from its concentration on a cross-enterprise development methodology to aid collaborating e-businesses in jointly creating secure and trusted interactions. Additionally, BOF4WSS aims to fit together a majority of the critical pieces of the WS security puzzle (for example, key new approaches such as [5, 6]) to propose a well-rounded, highly structured, extensible framework (framework and methodology being synonymous in the context of this work).

Progressing from the BOF4WSS methodology itself, our emphasis has shifted to supplying software to support it and assist in its seamless application to business scenarios. In previous articles we have presented (see [7]) and initially evaluated (see [8]) one of these tools, which was developed to support and ease security negotiations across collaborating e-businesses. In terms of BOF4WSS, this refers specifically to easing the transition from the individual Requirements Elicitation stage to the subsequent joint Negotiations stage. Generally, some of the main problems identified and targeted included understanding other companies’ security documentation, understanding the motivation behind partnering businesses’ security needs/decisions, and being able to easily match and compare security decisions from entities which target the same situation and risk.
Related work in [9, 10] and feedback from interviewed security practitioners support these issues. Building on previous research, the aim of this paper is therefore to extend the initial evaluation work in [8] and pull together the compatibility evaluation of the tool and the Solution model it embodies through the use of a case scenario analysis. This enables a more complete evaluation of the proposals because, unlike the compatibility assessment in [8], it progresses from the initial model stages to the final tool output produced. The scenario contains two companies using two popular system-supported security Risk Management/Assessment (RM/RA) methods. Topics to be considered in this analysis include: how tool data is transferred to the RM/RA approaches/software (as expected in the Solution model [7]); how typical RM/RA approach information is represented in the tool’s common, custom-built XML-based language; and finally, how close, if at all, the tool can bring together the different RM/RA approaches used by companies to ease stage transition within BOF4WSS. If the tool can interplay with a majority of the security-related information output from popular RM/RA techniques, its feasibility as a system that can work alongside current approaches used in businesses today will be evidenced.

The next section of this paper reviews the Solution model and resulting tool to support security negotiations across e-businesses. Then, in Sect. 18.3, we give a brief background on the business scenario and begin the case study analysis.


Findings are discussed as they are found. Section 18.4 completes this contribution by providing conclusions and outlining directions for future work.

18.2 Solution Model and Tool: A Recap

The Solution model is the conceptual base for the software tool developed in our research. It was initially presented in [7] and consists of four component stages. These are: Security Actions Analysis, Ontology Design, Language Definition and Risk Catalogue Creation.

Security actions analysis. This stage focuses on reviewing the literature in the security risk management field, and critically examining how security actions and requirements are determined. A security action is broadly defined as any way in which a business handles the risk it faces (e.g. ‘insurance will be purchased to protect against very skilled and sophisticated hacker attacks’), and a security requirement is a high-to-medium level desire, expressed to mitigate a risk (e.g. ‘classified information must be encrypted when transferred over a network connection’). The key outcome of this stage is a thorough understanding of the relevant security domain, which can then be used as a foundation for future stages.

Ontology design. The aim of this component is to produce a high-level ontology design using the findings from the previous stage, to establish a common understanding and semantic structure of the security actions (and generally security risk management) domain. This common or shared understanding is a critical prerequisite when considering the difficulties businesses face (because of the different terminologies used, RM/RA methods applied, and so on) as they try to understand their partners’ security documentation, which is supplied in BOF4WSS’ Negotiations phase. Further detail on the Security Actions Analysis and Ontology Design stages (inclusive of a draft ontology) can be seen in [11].

Language definition. This stage has two sub-components. First is the development of an XML-based language called Security Action Definition Markup Language (SADML). This allows for the establishment of a common format (based on the ontology) by which security actions/requirements information provided by companies is formally expressed, and also later processed by the resulting tool. Second is the proposal of a user-friendly interface, such as a data entry screen or template document, by which businesses’ security-related data can be entered and subsequently marked up in SADML. This interface would act as a guide for companies, prompting them to supply complete information as they prepare to come together for negotiations.

Risk catalogue creation. This final component stage addresses the problem of matching and comparing security actions/requirements across enterprises by defining a shared risks catalogue. Given that businesses use risks from this shared catalogue as input to their RM/RA methods, regardless of the security actions that they decide on individually, the underlying risks can be used by the tool to automatically match their actions. To increase flexibility, the catalogue would feature an extensive and updatable set of security risks.


Fig. 18.1 Process flow of implemented solution model. [Figure: risks (assets, threats, vulnerabilities) from a shared catalogue feed Comp A’s and Comp B’s risk management methodologies, with new risks exchanged; each company’s security actions and the factors motivating them (inclusive of risks, laws, policies, etc.) pass through a data entry & data storage system and then an encoding system (based on the language); the encoded security actions and motivating factors go to a comparison system (matching based on risk), which provides (i) a user-friendly interface where security actions and the related security risks are automatically matched and displayed, and (ii) flagged inconsistencies that represent exceptional situations and thus should be discussed by personnel.]

With a recap of the Solution model now provided, Fig. 18.1 shows a process flow of how the implemented model, i.e. the tool, works. In this diagram, Comp A and Comp B are companies using BOF4WSS for an online business scenario. To explain the process flow: first, companies select a set of risks from the catalogue that apply to their particular business scenario, and use these as input to their different RM/RA methodologies. Any new risks to be considered which are not available in the catalogue can be exchanged for this scenario. After companies have used their RM/RA approaches to determine their individual security actions (inclusive of motivational factors), these are input into the Data entry and storage system. This system uses a user-friendly interface to read in the data (as suggested in the Language Definition stage), and stores it in a back-end database to allow for data retrieval, updating and so on. This interface, and generally the tool, mirrors the understanding of concepts defined in the ontology. As companies are about to come together for Negotiations, the Encoding system is used to read security data from the database and encode it into SADML. In the Negotiations stage of BOF4WSS, companies bring their individual SADML documents and these are passed to the tool’s Comparison system. This system matches companies’ security actions based on the risks which they address, and aims to provide a user-friendly interface in which (i) security actions can be quickly compared and discussed, (ii) any inconsistencies are flagged for follow-up by personnel, and (iii) a shared understanding of security terms, risks and so on is upheld due to the references that can be made to the ontology.

Next, in Sect. 18.3, we conduct the case study analysis to give further insight into the use and practicality of the tool proposed.
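The matching step performed by the Comparison system can be sketched in a few lines of Python. This is an illustrative assumption rather than the tool's actual code: the `SecurityAction` record, its fields, the treatment labels and the two rules used to flag an inconsistency (a company has no action for a risk, or the companies chose different treatments) are hypothetical simplifications.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class SecurityAction:
    """Hypothetical, simplified view of one company's security action."""
    company: str
    risk_id: str      # ID taken from the shared risks catalogue
    treatment: str    # e.g. "mitigate", "retain", "transfer"
    description: str


def match_by_risk(actions: Iterable[SecurityAction]) -> Dict[str, List[SecurityAction]]:
    """Group security actions by the catalogue risk they address."""
    grouped: Dict[str, List[SecurityAction]] = defaultdict(list)
    for action in actions:
        grouped[action.risk_id].append(action)
    return dict(grouped)


def flag_inconsistencies(grouped: Dict[str, List[SecurityAction]],
                         companies: Iterable[str]) -> Dict[str, List[str]]:
    """Return, per risk, the reasons it should be discussed by personnel."""
    expected = set(companies)
    flags: Dict[str, List[str]] = {}
    for risk_id, actions in grouped.items():
        reasons: List[str] = []
        missing = sorted(expected - {a.company for a in actions})
        if missing:
            reasons.append("no action from: " + ", ".join(missing))
        if len({a.treatment for a in actions}) > 1:
            reasons.append("treatments differ")
        if reasons:
            flags[risk_id] = reasons
    return flags
```

Because both companies draw their risks from the same catalogue, the catalogue ID is sufficient as a join key; no text matching on descriptions is needed. Risks that attract no flag can be displayed side by side for quick confirmation, while flagged ones are queued for discussion.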

18.3 Case Study Analysis

The core aim of this section is to complete the compatibility and feasibility evaluation first presented in [8], using a full case study analysis. In that previous work, a very detailed discourse and a number of mappings were presented. Now


the objective is to put that and other aspects of the Solution model and tool into a more real-world context. In addition to further supporting the feasibility of this research’s proposals, this enables a more thorough evaluation of the model as it progresses from the initial Central Risk Catalogue to the final tool output.

The case scenario to be used consists of two businesses, Buyer and Supplier. These companies have worked with each other in the past using mainly manual and other offline interactions. To enable their processes to be more integrated and streamlined, the parties are now choosing to use the Internet and the WS technology suite for online business-to-business communications. As security is a key priority for the companies, they are adopting BOF4WSS to aid in the creation of a secure WS-based business scenario. In line with this paper, the area of focus is the progression from the Requirements Elicitation stage to the Negotiations stage. This involves the passing of, and then negotiation on, the entities’ security needs and requirements.

In terms of RM/RA and determining security needs and requirements, EBIOS and CORAS are the two methods used by the entities. EBIOS is a risk management approach for assessing and treating risks in the field of information systems security [12]. CORAS is a tool-supported methodology for model-based risk assessment of security-critical systems [13]. Specifically, to analyze risk and determine security actions, Buyer uses EBIOS and its software, whereas Supplier employs CORAS and its supporting tool.

Next, we begin the case study analysis. According to the Solution model flow (see Fig. 18.1), regardless of the RM/RA method used, the starting point of the scenario should be a common risks base or catalogue. This point, however, is where one of the first difficulties in the evaluation surfaced. When the model was first conceived it was assumed that the transferring of common risk data to RM/RA approaches would be done manually.
During the completion of this study, however, such a process actually proved somewhat tedious, especially in terms of accurate and consistent mapping of data from the common risks catalogue to the RM/RA methods and software. If there was a risk to the confidentiality of Web services messages in the Risk Catalogue system, therefore, the problem was how that data, and the related data on vulnerabilities, threats and assets, could be quickly, accurately and consistently entered into the RM/RA approaches and their software. Figure 18.2 depicts the area of focus in the ‘Process flow of the implemented Solution model’ diagram (Fig. 18.1).

Possibly the best solution to this problem resides in the automated mapping of data from the Central Risk Catalogue to the RM/RA method software, which in this case is represented by the EBIOS and CORAS tools (used by Buyer and Supplier respectively). Two options were identified by which this could occur. The first option consisted of adding an export capability to the Central Risk Catalogue system, which would output data on risks in the machine-readable formats of common RM/RA approach software. This is beneficial because it would be a central point where numerous RM/RA software formats could be generated. Furthermore, it could take advantage of the ‘Import’ and ‘Open File/Project’ functionalities, which are standard in a number of RM/RA software packages. For example, both the CORAS and EBIOS tools have these capabilities.

Fig. 18.2 Area of focus in process flow of implemented solution model. [Figure: the risks (assets, threats, vulnerabilities) catalogue passes the risk to the confidentiality of Web services messages to Supplier’s CORAS methodology & software and to Buyer’s EBIOS methodology & software, with new risks exchanged.]

One caveat noticed when assessing the Risk Catalogue export capability option is that unique identification numbers (IDs) for elements (e.g., Menace IDs in EBIOS or risk-analysis-result IDs in CORAS) generated by the Central system might conflict with the same element IDs generated by the actual software running at each company. There would therefore need to be some agreed allotment of ID ranges for the Catalogue-based option to function properly.

The second option suggests a more decentralized implementation where extensions could be added to the RM/RA software systems to enable them to read in and process Risk Catalogue system data. This would avoid the problem of conflicting IDs, but introduces the need to access, understand and edit various software systems. For this case, EBIOS and CORAS are good candidates in this regard as both are open source implementations (see [12] and [13] respectively).

Apart from the programming that would be necessary in both options above, there is the question of exactly how to map Risk Catalogue system data to EBIOS and CORAS. This, however, can be largely addressed by reversing the mapping tables used as the basis for previous evaluation work in [8]. This is because the tool’s Entity Relationship Diagram (ERD) is not dissimilar to that of the Risk Catalogue system. Essentially, one would now be going from database records to EBIOS and CORAS software XML formats. Risk, ProjectRisk, Asset, Vulnerability and Threat are some of the main database tables mapped in [8] that would be used in reverse to map risks data from the Catalogue system.

Having briefly digressed from the case study to discuss how transferring data from the shared risks catalogue could be addressed, the focus resumes at the RM/RA software stage. This relates to the bottom two boxes in Fig. 18.2. After Supplier and Buyer have agreed the risks to be used, they conduct their individual analyses.
This generally encompasses the processes of risk estimation, risk evaluation and treatment. The two code snippets below give an initial idea of the data generated by each entity’s RM/RA method. These and most of the following examples are based around a security risk defined by the companies relating to the integrity and confidentiality of Web services messages passed between them during online interactions. Hereafter, this is referred to simply as Risk101; ‘Risk101’ is also used as the lower-level ID value originating from the risks catalogue which is employed in each company’s RM/RA software. From the code snippets, one can see exactly how different the representations of the same risk may be from company to company. As would be expected, a similar reality exists regarding the other types of data produced (e.g. related to risk factors, risk estimates, security actions and so on). The + sign in the code indicates that there is additional data which is not displayed/expanded owing to space limitations.


+ <ScenarioPotentiality potentiality="Potentiality.1076645892186">

Code Snippet 1. EBIOS (Buyer) representation of the risk

Risk101 WSMessage Eavesdropping and tampering with data in a Web services' message (in transit) Medium Low

Code Snippet 2. CORAS (Supplier) representation of the risk

With the RM/RA methodologies at each business complete, the next step was mapping the output data from Buyer and Supplier to the tool. This process was covered in detail in [8] and therefore is not analyzed in depth here. From a case study perspective, however, one intriguing additional observation was made: although the RM/RA methods did not accommodate certain data in the very structured way expected by the tool, this did not mean that the data was absent from companies’ considerations. In Supplier’s CORAS software output shown in Code Snippet 3, for example, it is apparent that a limited security budget influenced Supplier’s treatment strategy decision (see treatmentDescription columnId). Any automated mapping to our tool should therefore ideally capture this data as a unique Risk Treatment factor. To recap, a treatment factor is an aspect that influences or in some way motivates a particular treatment for a risk. Common examples are laws, regulations, security policies, limited budgets and contractual obligations. Capturing this treatment data was not possible, however, because the machine-readable output of CORAS does not distinctly define such aspects in its XML structure. Here it is just in plain text.

TRT101 Risk101 Retain The unlikeliness of this risk and a limited security budget are the reasons for risk acceptance Threat_Analysis09.doc

Code Snippet 3. CORAS representation of a risk treatment

A similar situation is present in Buyer’s EBIOS output regarding risk estimation. In this case, Buyer has used EBIOS to prioritize risks; however, because the technique is so elaborate, it does not allow for a clear and reliable automated mapping to the risk level concepts in the tool.


To tackle these mapping issues a few other techniques were assessed, but manual mapping proved to be the only dependable solution. This mapping involved noting the type of data requested by the tool (such as influential security policies or budgetary limitations) and using its data entry screens to manually enter that data. This was easily done in this case through the creation of a TreatmentFactor record in the tool and then linking that record to the respective risk treatment, formally the SecurityAction database record. The TreatmentFactor table is used to store elements that influence or affect the treatment of risks. Examples of such were mentioned previously. Regarding the manual risk estimation and prioritization mapping needed for EBIOS, a level of subjectivity would be introduced as users seek to map values in their analysis to the risk levels expected in the tool. To compensate for this subjectivity, detailed justifications and descriptions of chosen risk levels should be provided by parties. This information would be entered in the risk_level_remarks, probability_remarks, impact_remarks and adequacy_of_controls_remarks fields of the tool’s respective RiskEstimate database record. (The RiskEstimate table defines the value of a risk, the probability and impact of it occurring, and the effectiveness of current controls in preventing that risk.) Generally, at the end of mapping, companies’ personnel should browse screens in the tool to ensure that all the required information has been transferred.

The next step in the case study was encoding each business’s mapped data (now in the tool’s database) to SADML documents. This process went without error. In Code Snippet 4, an example of the security risk under examination (Risk101) is presented. The marked-up risk data has the same basis across businesses and documents due to the use of the shared risks base in the beginning. SADML provides the common structure, elements and attribute names.
Different companies may add varying comments or descriptions, however. The specific code in Snippet 4 is from Buyer.

Eavesdropping and tampering with data in a Web services' message (in transit) Malicious party Circulating information in inappropriately secured formats property:data web service message The data carried in the message is the key aspect Violation of confidentiality using eavesdropping ...

Code Snippet 4. SADML representation of the highlighted risk


A Case Study Analysis of an E-Business Security Negotiations Support Tool


The real difference in SADML documents across Buyer and Supplier is visible when it comes to the treatment of Risk101. In this case, Buyer aims to mitigate this risk while Supplier accepts it. SADML Code Snippet 5 shows this and the respective treatment factors. On the left hand side is Buyer’s document and on the right, Supplier’s. The + sign indicates that there is additional data which is not displayed here. <mitigationAction> Protect against eavesdropping on Web service messages being transmitted between partners <details>The organization must take measures to ensure there is no eavesdropping on data being transmitted between Web services across business parties. + + + <securityPolicyRefs> <securityBudgetRefs /> + <securityRequirementRefs>

The unlikeliness of this risk and a limited security budget are the reasons for risk acceptance <details>Threat_Analysis09.doc + <securityPolicyRefs> + <securityBudgetRefs> +

Code Snippet 5. SADML representations of companies’ risk treatment choices

When compared to the original output from EBIOS and CORAS, one can appreciate the use of the standard format supplied by SADML. In this respect, SADML provides a bridge between different RM/RA methods and their software systems, which can then be used as a platform to compare high-level security actions across enterprises. It is worth noting that the benefits possible with SADML are largely due to its foundation in the well-researched ontology from the Solution model [7, 11]. With all the preparatory stages in the case process completed, Fig. 18.3 displays the output of the model’s final stage, i.e. the tool’s Comparison System, which is presented to personnel at Buyer and Supplier. Apart from the user-friendly, colour-coded report, the real benefit associated with this output is the automation of several of the preceding steps taken to reach this point. These included: (i) gathering data from RM/RA approaches (such as EBIOS and CORAS), albeit in a semi-automated fashion; (ii) allowing for influential factors in risk treatment that are key to forthcoming negotiations, to be defined in initial stages; and finally, (iii) matching and comparing the security actions and requirements of companies based on shared risks faced. The output in Fig. 18.3 also aids in reconciling semantic differences across RM/RA approaches as these issues are resolved by mapping rules earlier in the process (as covered in [8]). Furthermore, personnel from companies can refer to the


J. R. C. Nurse and J. E. Sinclair

Fig. 18.3 Area of focus in process flow of implemented solution model

ontology and the inclusive shared definitions/terminology at any point. This would be done to attain a clear understanding of terms in the context of the interactions. As parties come together, therefore, they can immediately identify any conflicts in treatment choices and have the main factors supporting those conflicting choices displayed. This and the discussion above give evidence that, in many ways, our tool has brought the interacting enterprises closer together, particularly in bridging a number of key gaps across companies. This therefore allows for an easier transition between the Requirement Elicitation and Negotiation phases in BOF4WSS. The shortcomings of the tool identified in this section’s case study centered on the manual effort needed at a few stages to complete data mapping. This acted to limit some of the Solution model’s automated negotiations support goals. To critically consider this point however, the level of automation and support that is present now would significantly bridge the disparity gaps and support a much


easier negotiation on security actions between parties. A small degree of manual intervention therefore, even though not preferred, might be negligible. This is especially so in business scenarios where there are large numbers of risks or security actions to be deliberated, and thus saving time at any point would result in substantial boosts in productivity.
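
The conflict identification discussed above (Buyer mitigating Risk101 while Supplier accepts it) can be illustrated with a minimal Python sketch. It is not the tool's Comparison System: the Treatment record, the action vocabulary and the factor labels are hypothetical names chosen for illustration, and only the matching of treatments on shared risk identifiers follows the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Treatment:
    """Hypothetical, simplified view of one company's treatment of a shared risk."""
    risk_id: str    # identifier from the shared risks base, e.g. "Risk101"
    action: str     # illustrative vocabulary: "mitigate", "accept", ...
    factors: tuple  # short labels for the treatment factors cited


def find_conflicts(buyer, supplier):
    """Return (risk_id, buyer_treatment, supplier_treatment) for shared risks treated differently."""
    supplier_by_risk = {t.risk_id: t for t in supplier}
    conflicts = []
    for b in buyer:
        s = supplier_by_risk.get(b.risk_id)
        if s is not None and s.action != b.action:
            conflicts.append((b.risk_id, b, s))
    return conflicts


buyer = [Treatment("Risk101", "mitigate", ("security policy", "security requirement"))]
supplier = [Treatment("Risk101", "accept", ("limited security budget", "low likelihood"))]

for risk_id, b, s in find_conflicts(buyer, supplier):
    print(f"{risk_id}: Buyer={b.action} {b.factors} vs Supplier={s.action} {s.factors}")
```

Keying the comparison on the shared risk identifier mirrors the role of the shared risks base: without a common identifier, the two documents could not be matched automatically.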

18.4 Conclusion and Future Work

The main goal of this paper was to extend initial evaluation work in [8] and pull together the compatibility evaluation of the tool, and generally the Solution model it embodies, through the use of a full case study analysis. The findings from this new and more complete analysis are seen to supply further evidence to support the tool as a useful, feasible and practical system to aid in cross-enterprise security negotiations. This is especially so in terms of BOF4WSS, but there might also be opportunities for its use in other collaborative e-business development methodologies. The main benefit of the tool and model is to be found in a much easier negotiation process, which then results in significantly increased productivity for companies. There are two prime avenues for future work. The first avenue consists of testing the tool with other RM/RA techniques; IT-Grundschutz Manual [14] and NIST Risk Management Guide for Information Technology Systems SP800-30 [15] are some of the methods under investigation for this task. Positive evaluation results would further support the tool and any justified nuances of those popular techniques would aid in its refinement. The second avenue is more generic and looks towards the research and development of additional approaches and systems to support BOF4WSS. Considering the comprehensive and detailed nature of the framework, support tools could be invaluable in promoting BOF4WSS’s use and seamless application to scenarios.

References

1. PricewaterhouseCoopers LLP. Information Security Breaches Survey 2010 [Online]. Available: http://www.pwc.co.uk/eng/publications/isbs_survey_2010.html
2. Tiller JS (2005) The ethical hack: a framework for business value penetration testing. Auerbach Publications, Boca Raton
3. Nurse JRC, Sinclair JE (2009) BOF4WSS: a business-oriented framework for enhancing web services security for e-Business. In: 4th International Conference on Internet and Web Applications and Services. IEEE Computer Society, pp 286–291
4. Nurse JRC, Sinclair JE (2009) Securing e-Businesses that use Web Services - A Guided Tour through BOF4WSS. Int J Adv Internet Technol 2(4):253–276
5. Steel C, Nagappan R, Lai R (2005) Core security patterns: best practices and strategies for J2EE™, web services and identity management. Prentice Hall PTR, Upper Saddle River
6. Gutierrez C, Fernandez-Medina E, Piattini M (2006) PWSSec: process for web services security. In: IEEE International Conference on Web Services, pp 213–222


7. Nurse JRC, Sinclair JE (2010) A solution model and tool for supporting the negotiation of security decisions in e-business collaborations. In: 5th International Conference on Internet and Web Applications and Services. IEEE Computer Society, pp 13–18
8. Nurse JRC, Sinclair JE (2010) Evaluating the compatibility of a tool to support e-businesses’ security negotiations. In: Lecture notes in engineering and computer science: Proceedings of the World Congress on Engineering 2010, WCE 2010, London, UK, pp 438–443
9. Yau SS, Chen Z (2006) A framework for specifying and managing security requirements in collaborative systems. In: Yang LT, Jin H, Ma J, Ungerer T (eds) Autonomic and Trusted Computing, ser. Lecture Notes in Computer Science, vol 4158. Springer, Heidelberg, pp 500–510
10. Todd M, Zibert E, Midwinter T (2006) Security risk management in the BT HP alliance. BT Technol J 24(4):47–52
11. Nurse JRC, Sinclair JE (2009) Supporting the comparison of business-level security requirements within cross-enterprise service development. In: Abramowicz W (ed) Business Information Systems, ser. Lecture Notes in Business Information Processing, vol 21. Springer, Heidelberg, pp 61–72
12. DCSSI (2004) Expression des besoins et identification des objectifs de sécurité (EBIOS) - Section 1–5, Secrétariat Général de la Défense Nationale. Direction Centrale de la Sécurité des Systèmes d’Information, Technical Report
13. den Braber F, Braendeland G, Dahl HEI, Engan I, Hogganvik I, Lund MS, Solhaug B, Stolen K, Vraalsen F (2006) The CORAS model-based method for security risk analysis. SINTEF, Technical Report
14. Federal Office for Information Security (BSI). IT-Grundschutz Manual [Online]. Available: https://www.bsi.bund.de/EN/Topics/ITGrundschutz/itgrundschutz_node.html
15. National Institute of Standards and Technology (NIST) (2002) Risk management guide for information technology systems (Special Publication 800-30), Technical Report

Chapter 19

Smart Card Web Server

Lazaros Kyrillidis, Keith Mayes and Konstantinos Markantonakis

Abstract In this article (based on “Kyrillidis L, Mayes K, Markantonakis K (2010) Web server on a SIM card. Lecture notes in engineering and computer science: Proceedings of the World Congress on Engineering 2010, WCE 2010, 30 June–2 July 2010, London, UK, pp 253–259”) we discuss the integration of a web server on a SIM card and attempt an analysis from various perspectives (management, operation, security). A brief presentation of the Smart Card Web Server (SCWS) is given, along with a use case that will help the reader to see how an SCWS can be used in practice, before we reach a final conclusion.

19.1 Introduction

The World Wide Web (WWW) was a major step forward for humanity in terms of communication, information and entertainment. Originally, web pages were static, not being changed very often and without any user interaction. This lack of interactivity led to the creation of server-side scripting languages (like PHP) that allowed the creation of dynamic pages. These pages are often updated according to the users’ interests and in recent years even their content is created by the users (blogs, social networking, etc.). In order for these pages to be properly created

L. Kyrillidis (&) Information Security Strategy Consultant, Agias Lavras 3, Neapoli, Thessaloniki, Greece e-mail: [email protected]
K. Mayes · K. Markantonakis Smart Card Centre, Royal Holloway, University of London, London, UK e-mail: [email protected]
K. Markantonakis e-mail: [email protected]

S. I. Ao and L. Gelman (eds.), Electrical Engineering and Applied Computing, Lecture Notes in Electrical Engineering, 90, DOI: 10.1007/978-94-007-1192-1_19, Springer Science+Business Media B.V. 2011


L. Kyrillidis et al.

and served, a special type of computer program is required. This program, known as a web server, accepts the users’ requests, processes them and returns the result to the requesting user’s browser. Another important step for modern communications was the invention of mobile phones. The first devices suffered from fraud, because it was quite easy to intercept communications and clone phones. This inevitably led to the introduction of a secure, tamper-resistant module that could be used for securing the storage of sensitive information and of the cryptographic keys/algorithms that were used to secure communication between the phone and the network provider’s facilities. This module is referred to as the Subscriber Identity Module (SIM). The idea of hosting a web server on a SIM was proposed almost a decade ago [1] and, although it is not yet commercially available, recent technological advances suggest that the idea could be reconsidered. While the integration of SIM and web server could offer fascinating new prospects to both the network providers and the users, there is a security concern that it might also help attackers to gain access to the SIM contents. A practical concern is the extent of the added management, operation and personalization costs that this integration would entail.

19.2 Web Server on a SIM Card

The Open Mobile Alliance (OMA) is an international body formed to produce and promote standards for the mobile communications industry, in order to encourage the interoperability of products, aiming at lower operational costs and higher quality products for the end users [2]. One of the standards that OMA created is the web server on the SIM card standard (Smart Card Web Server, SCWS) [3–5]. This standard defines a number of entities that the web server must contain:

• SCWS: The web server itself. It is located inside the SIM card.
• SCWS gateway: This entity is needed when the SIM card cannot directly respond to Hypertext Transfer Protocol (HTTP) requests, so the gateway’s main purpose is to translate the browser’s requests from HTTP to the local transport protocol and vice versa. A common local protocol would be the Bearer Independent Protocol (BIP) [6]. Additionally, the gateway is proposed to host a form of Access Control List (ACL) to control the access to the SCWS. It is located on the phone.
• HTTP(s) Client: The browser that will initiate requests towards the SCWS and will present the response to the end user. It is located on the phone.
• SCWS Administration Application: This entity is used for SCWS software updates/patches that would be applied remotely to the SCWS and for installing and/or updating possible web applications that may run on the SIM card. Additionally, it may be used to send new content in the form of HTML pages to the SCWS. It is located in the network provider’s premises.


19.2.1 Communication Protocols

The SCWS will use two different protocols to communicate with entities outside of the SIM card. The first is BIP, which encapsulates HTTP(s) packets when the SIM card does not implement its own TCP/IP stack; the second is HTTP(s) itself, for the newer smart cards that allow direct HTTP access. We will now take a more detailed look at these two protocols and how they can be used with the SCWS (according to [3–5]).

19.2.1.1 BIP Protocol

As mentioned earlier, BIP allows incoming and outgoing communication when the SIM card cannot support direct HTTP access. The SIM card will work in two modes:

• Client mode: The SCWS communicates with the remote administration entity to receive updates. The gateway translates requests from BIP to HTTP(s) (the SIM card can “speak” BIP, while the remote administration entity will “speak” HTTP(s)).
• Server mode: The SCWS communicates with the browser. The gateway is once again present, executing the translation between the BIP protocol that the SIM card understands and the HTTP(s) requests/answers that the browser understands.

19.2.1.2 TCP/IP Protocol

If the SIM card implements its own TCP/IP stack (from Java Card 3.0 onwards), there is no need for the gateway and the communication takes place directly between the two entities (either the SCWS and the remote server or the SCWS and the browser).
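
To make the gateway's translation role more concrete, the following Python sketch shows a toy form of encapsulation: an HTTP message is split into small numbered packets for a constrained local transport and reassembled on the other side. This is not the BIP framing defined in the specifications; the 4-byte header layout, the chunk size and the function names are assumptions made purely for illustration.

```python
import struct

CHUNK_SIZE = 200  # hypothetical payload limit per local-transport packet


def encapsulate(http_message: bytes, chunk_size: int = CHUNK_SIZE) -> list:
    """Split an HTTP message into packets: 2-byte index, 2-byte total, then payload."""
    chunks = [http_message[i:i + chunk_size]
              for i in range(0, len(http_message), chunk_size)] or [b""]
    total = len(chunks)
    return [struct.pack(">HH", idx, total) + chunk for idx, chunk in enumerate(chunks)]


def reassemble(packets: list) -> bytes:
    """Rebuild the HTTP message from packets, tolerating out-of-order arrival."""
    parsed = []
    for packet in packets:
        idx, total = struct.unpack(">HH", packet[:4])
        parsed.append((idx, total, packet[4:]))
    parsed.sort(key=lambda item: item[0])
    if not parsed or len(parsed) != parsed[0][1]:
        raise ValueError("missing packets")
    return b"".join(payload for _, _, payload in parsed)


request = b"GET /file1/file2/test.html HTTP/1.1\r\nHost: 127.0.0.1:3516\r\n\r\n"
packets = encapsulate(request, chunk_size=16)
assert reassemble(list(reversed(packets))) == request
```

The point of the sketch is only that the browser and the SCWS both see ordinary HTTP, while the gateway hides the local transport between them.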

19.2.2 Administration Protocols

When the SCWS is in client mode, it exchanges messages with the remote server. There are two ways for this message exchange to take place:

• When the amount of data to be exchanged is relatively small, the Lightweight Administration Protocol should be used. The commands that encapsulate the data are carried in one or more SMS messages. When the SCWS receives them, it parses them and must then send a response back to the remote server, so that the latter can determine whether to send the next command or simply terminate the connection.
• The second way is the Full Administration Protocol. A card administration agent located inside the SIM card is responsible for encapsulating and transferring the HTTP messages over PSK-TLS, for establishing the connection and, if necessary, for reconnecting if the connection is dropped. The agent sends a message to the remote server and the latter responds with the administration command encapsulated within an HTTP response. The agent receives the command and passes it to the SCWS, which executes it. When the command has been executed, the agent contacts the remote server for the next command. This operation continues until the remote server terminates the connection.
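
The request/execute/repeat cycle of the Full Administration Protocol can be sketched as a simple loop. The Python below is a minimal illustration under stated assumptions: the RemoteServer and Scws classes, their method names and the command strings are hypothetical stand-ins, and the PSK-TLS transport and HTTP encapsulation are not modelled at all.

```python
from collections import deque


class RemoteServer:
    """Stand-in for the remote administration server (hypothetical interface)."""

    def __init__(self, commands):
        self._pending = deque(commands)

    def next_command(self, last_status):
        # Returns the next command, or None to terminate the session.
        # A real server would inspect last_status before deciding.
        return self._pending.popleft() if self._pending else None


class Scws:
    """Stand-in for the SCWS; it only records what it was asked to execute."""

    def __init__(self):
        self.executed = []

    def execute(self, command):
        self.executed.append(command)
        return "ok"


def run_admin_session(server, scws):
    """Agent loop: request a command, hand it to the SCWS, report back, repeat."""
    status = None  # the first message carries no previous result
    while True:
        command = server.next_command(status)
        if command is None:  # the server ended the session
            break
        status = scws.execute(command)
    return scws.executed


server = RemoteServer(["PUT /index.html", "DELETE /old.html"])
print(run_admin_session(server, Scws()))
```

Note that the session is driven by the card-side agent, which contacts the server for each command, while the server decides when the session ends.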

19.2.3 SCWS URL, IP, Port Numbers

As mentioned earlier, there will be two ways to communicate with the SCWS: either over HTTP or BIP. The port numbers that will be used when the HTTP requests are encapsulated inside BIP packets are 3516 for HTTP and 4116 for HTTPs. The format of the URL in both cases will be:

http://127.0.0.1:3516/file1/file2/test.html
https://127.0.0.1:4116/sec_file1/sec_test.html

If the access to the SCWS is provided directly over HTTP(s), the port numbers are the ones used for traditional web servers (80 for HTTP and 443 for HTTPs). The SIM card will now have its own IP address, so the loopback will no longer be needed. The format of the URL will be:

http://<smart_card_IP>[:80]/file1/file2/test.html
https://<smart_card_IP>[:443]/sec_file1/sec_test.html
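
The addressing rules above can be captured in a small helper. This Python sketch uses the port numbers and the loopback address given in the text; the function name, its parameters and the example card address are illustrative assumptions.

```python
BIP_PORTS = {"http": 3516, "https": 4116}  # HTTP(s) encapsulated in BIP, loopback address
DIRECT_PORTS = {"http": 80, "https": 443}  # card has its own TCP/IP stack and IP address


def scws_url(path, secure=False, card_ip=None):
    """Build an SCWS URL; card_ip=None means access through the gateway over BIP."""
    scheme = "https" if secure else "http"
    if card_ip is None:
        host, port = "127.0.0.1", BIP_PORTS[scheme]
    else:
        host, port = card_ip, DIRECT_PORTS[scheme]
    return f"{scheme}://{host}:{port}/{path.lstrip('/')}"


print(scws_url("file1/file2/test.html"))
print(scws_url("sec_file1/sec_test.html", secure=True, card_ip="192.0.2.10"))
```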

19.3 Using the SCWS for E-Voting

A possible use for the SCWS is given in the following example: a country X provides all its citizens with an ID card that stores (in addition to the citizen’s name and ID card number) two certificates (one for encryption/decryption, one for digital signatures), the corresponding private keys and the government’s public keys (for a similar example, see [7]). These certificates are also installed in a central location that is administered by the government. The citizen can use his ID card for every transaction, either with the state or with other citizens. Additionally, the country has arranged with the mobile network providers to install these certificates/keys on the citizen’s mobile phone, so that the latter can use it to vote. The voting process is described in the following use case:

19.3.1 Process Flow

Let us suppose the following: $Cert_{A1}$ is the user’s certificate, $P_{A1}$ the private key and $PU_{A1}$ the public key used for encryption/decryption, and $Cert_{A2}$ is the user’s certificate, $P_{A2}$ the private key and $PU_{A2}$ the public key used for digital signatures. Likewise, $Cert_{B1}$ is the government’s certificate, $P_{B1}$ the private key and $PU_{B1}$ the public key used for encryption/decryption, and $Cert_{B2}$ is the government’s certificate, $P_{B2}$ the private key and $PU_{B2}$ the public key used for digital signatures. Also let $H$ be the hash algorithm that the two parties will use. The e-voting process is as follows:

• The network provider has updated the user’s SCWS slightly by presenting a link on the user’s home page named “e-voting”.
• The user clicks on the link.
• The user’s name and ID card number are encrypted with $PU_{B1}$; additionally, both of them are hashed to produce the hash $H_A$, which is signed with $P_{A2}$. All these are sent to the government’s remote server that hosts the e-voting site:

$$(ID_A, Name_A)_{PU_{B1}} \;\; (H(ID_A, Name_A))_{P_{A2}} \longrightarrow \text{Voting Server}$$

• The remote server decrypts $(ID_A, Name_A)_{PU_{B1}}$ using $P_{B1}$, extracts $PU_{A2}$ from $Cert_{A2}$ (which it already knows), verifies $(H(ID_A, Name_A))_{P_{A2}}$ using $PU_{A2}$ (and gets $H_A$), hashes $(ID_A, Name_A)$ using $H$ (and gets $H_B$), and checks $H_A$ against $H_B$. If the two hashes match each other, the server authenticates the user and may proceed with the rest of the process. Then it checks whether the citizen has already voted and, if not, it creates a temporary entry in a database to show that the user’s voting is in progress.
• After the user is authenticated to the server, he is presented with a link that points to the IP of the SCWS. The user clicks on the link and is transferred to the SCWS environment. At the same time the remote server hashes $(ID_A, Name_A)$, sends the hash signed with $P_{B2}$ and also sends an encrypted link $L$, which has embedded authentication data that will be used later on by the user (to authenticate himself on the remote site instead of providing a username/password):

$$(H(ID_A, Name_A))_{P_{B2}} \;\; (L)_{PU_{A1}} \longleftarrow \text{Voting Server}$$

• The SCWS receives $(H(ID_A, Name_A))_{P_{B2}}$ and verifies it using $PU_{B2}$. Then it hashes the user’s ID and name that are stored on the SIM card with $H$ and, if the two hashes match each other, the server is authenticated and the SCWS can now prompt the user to provide the PIN. Additionally, the SCWS decrypts $L$ using $P_{A1}$.
• The user provides the PIN; it is checked by the SCWS and, if it is correct, the SCWS displays the link $L$ that points to the remote server.
• The user clicks on the link and can now browse the voting site and vote. When his voting is done, the permanent entry in the database is updated, to show that the user has voted.
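
The hash-then-sign check that both sides perform (signing H(ID, Name) with a private key, then recovering HA with the public key and comparing it against a freshly computed HB) can be sketched in Python as follows. This is a toy under loud assumptions: it uses textbook RSA without padding over a deliberately small modulus, SHA-256 as the hash H, and a "|" separator between ID and name, none of which is specified in the text. The encryption of (ID, Name) and of the link L is omitted, and a real deployment would rely on a vetted cryptographic library.

```python
import hashlib

# Toy RSA key built from two Mersenne primes; far too small for real use.
P, Q = 2**61 - 1, 2**89 - 1
N = P * Q
E = 65537                          # public exponent (plays the role of PUA2)
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent (plays the role of PA2); Python 3.8+


def h(citizen_id: str, name: str) -> int:
    """Hash (ID, Name) with SHA-256 and reduce it below the modulus."""
    digest = hashlib.sha256(f"{citizen_id}|{name}".encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % N


def sign(citizen_id: str, name: str) -> int:
    """Citizen side: sign the hash with the private signature key."""
    return pow(h(citizen_id, name), D, N)


def verify(citizen_id: str, name: str, signature: int) -> bool:
    """Server side: recover HA from the signature and compare it with HB."""
    ha = pow(signature, E, N)
    hb = h(citizen_id, name)
    return ha == hb


signature = sign("ID-0001", "A. Citizen")
print(verify("ID-0001", "A. Citizen", signature))
```

The same check runs in the opposite direction when the SCWS verifies the server's signed hash against the ID and name stored on the SIM card.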

19.3.2 Comments on this Use Case

One could question the need to use an SCWS for e-voting. While this document cannot explore the legal and ethical issues that arise because of the sensitive nature of elections, there are some reasons that can justify its use. The first is that a large part of the population is already familiar with using a browser through day-to-day internet access. Moreover, it is fairly easy for people to learn how to use one, even if they do not have previous experience. Another important reason is that the security needed for e-voting (and other similar uses like e-shopping) can be provided by using the SCWS. The SIM card is the most secure token in mass production at the moment and can easily store all the sensitive information needed (certificates, keys, personal information). After all, even if a phone or SIM card is lost or stolen, it will be quite difficult for someone to extract the necessary information and, by the time that he manages to do so, the certificates/keys will most probably have been revoked. A third reason is the transparency of the process. As mentioned earlier, the user needs to know how to use a browser and nothing more. All the necessary message exchange takes place without user interaction, except when entering the PIN, and this allows more complex protocols and longer cryptographic keys to be used.

The most difficult part of the overall process is defining who will be responsible for storing all these certificates/keys on the SIM card. Are the mobile operator companies trusted to install this sensitive information on the SIM cards and, in case they are not, will they allow the government to use their facilities? What happens with lost/stolen phones (revocation of the certificates), or simply when a user changes phone and/or network provider? Additionally, it must be ensured that everything runs smoothly, so that the election result is not disputed and that the legitimate user can vote only once. This is quite a challenge because, although the Internet is used by more and more people for all kinds of different purposes, it is far from being a secure environment. The opportunities that it provides for shopping, communicating, etc. attract malicious people and offer them a whole new environment where they can launch their attacks against unwary users. A number of malicious programs are created every day, including Trojan horses, viruses, rootkits and other attack software that is used for data theft, communications corruption, information destruction, etc. Although the SIM card is designed as an attack/tamper-resistant platform, extending its ability to serve HTTP(s) requests will make the SIM card and mobile phone even more attractive attack targets.

19.3.3 Secure Communication Channel

There are two ways for the SCWS to communicate with entities outside of the SIM card environment. The first is the communication between the SCWS and a remote server in order for the former to receive updates, and the second is the communication with the phone’s browser when the user submits requests to the SCWS. Both communication mechanisms need to be protected adequately.


The communication between the SCWS and the remote server is of vital importance, because it provides the necessary updates to the SCWS from a central location. Symmetric cryptography can provide the necessary level of security through the use of a pre-shared symmetric key [8]. The key has to be strong (long) enough so that even if the communication is eavesdropped, an attacker cannot decrypt or alter it. In addition, this offers mutual authentication, because the key is known only to the two entities, thus every message encrypted with that key can only come from a trusted entity. The security of communication between the SCWS and the browser is also very important and the necessary level of protection can be provided in a variety of ways. As with the traditional Internet, the user can use either the HTTP or the HTTPs protocol to communicate with the SCWS. If the browser on the mobile phone requests information that needs little or no security, the communication can pass over the HTTP protocol, while communication that is sensitive is protected via HTTPs, thus offering confidentiality, integrity and authentication. Another security measure that can offer a second level of security is the use of the PIN to authenticate the user to the SCWS. The final measure that can be utilized is some form of ACL that allows applications meeting certain trust criteria to access and communicate with the SCWS, while blocking non-trusted applications.

19.3.4 Data Confidentiality/Integrity

The SCWS handles two kinds of data: data stored on the SIM card and data in transit. The first kind of data has an adequate level of protection, as modern smart cards are designed to strongly resist unauthorized access to the card data. An attacker would need costly and advanced equipment, expert knowledge and a lot of time, as modern smart cards have many countermeasures to resist known attacks [9]. Data in transit cannot benefit from the protection that the card offers and is more exposed to attacks. If data in transit does not pass over secure channels with the use of the necessary protocols, it may be altered, destroyed or eavesdropped. Measures must be taken so that these actions are detected and, if possible, prevented. An attacker may alter data in two ways: by just destroying a message (or transforming it into a meaningless one) or by trying to produce a new version of the message with an altered meaning. The first attack simply aims to “break” the communication, while the second one aims to exploit a weakness, e.g. to execute malicious commands against the server. It is obvious that the latter is far more difficult, especially when the message is encrypted or hashed. In the SCWS context, the messages are mostly the commands exchanged between the SCWS and the remote server for remote administration or between the SCWS and the user’s browser. The alteration of the exchanged messages can be avoided by adding a MAC at the end of the message when there is a pre-shared key (in the case of the remote administration) or by using digital signatures when a pre-shared key cannot be (securely) exchanged. Using any form of strong encryption can provide the necessary confidentiality for the data that is handled by the SCWS, and symmetric or public-key cryptography can be used according to needs. OMA proposes the use of PSK-TLS for confidentiality/integrity between the SCWS and the remote server and public-key cryptography for the communication between the SCWS and the various applications on the phone. In the second case the use of PSK-TLS is optional.
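
Appending a MAC computed with a pre-shared key, as described above, can be sketched with Python's standard hmac module. The choice of HMAC-SHA256, the function names and the message layout (message followed by a fixed-length tag) are assumptions for illustration; the text does not specify a MAC algorithm.

```python
import hashlib
import hmac

TAG_LEN = hashlib.sha256().digest_size  # 32 bytes for SHA-256


def protect(message: bytes, key: bytes) -> bytes:
    """Append an HMAC-SHA256 tag to the message."""
    return message + hmac.new(key, message, hashlib.sha256).digest()


def check(protected: bytes, key: bytes) -> bytes:
    """Return the message if its tag verifies; raise ValueError otherwise."""
    if len(protected) < TAG_LEN:
        raise ValueError("message too short")
    message, tag = protected[:-TAG_LEN], protected[-TAG_LEN:]
    expected = hmac.new(key, message, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):  # constant-time comparison
        raise ValueError("MAC check failed")
    return message
```

A MAC of this kind detects alteration but provides no confidentiality; the encryption mentioned in the text would be applied separately.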

19.3.5 Authentication

For authentication purposes, OMA proposes the use of Basic Authentication and optionally the use of Digest Authentication [10]. While the former can be used when there is little or no need for authentication, when an application/entity needs to authenticate the SCWS and vice versa, the use of Digest Authentication is mandatory.
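
The difference between the two schemes can be seen by computing the values a client would send. The Python sketch below follows the standard HTTP definitions in RFC 2617: Basic sends the base64-encoded credentials, while Digest sends an MD5-based response bound to a server nonce. Whether an SCWS deployment uses the qop=auth variant shown here is an assumption, and the function names are illustrative.

```python
import base64
import hashlib


def basic_header(username: str, password: str) -> str:
    """Basic: credentials are only base64-encoded, so they are trivially recoverable."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return f"Basic {token}"


def _md5_hex(text: str) -> str:
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def digest_response(username, realm, password, method, uri, nonce, nc, cnonce, qop="auth"):
    """Digest (RFC 2617, qop=auth): the password itself never crosses the channel."""
    ha1 = _md5_hex(f"{username}:{realm}:{password}")
    ha2 = _md5_hex(f"{method}:{uri}")
    return _md5_hex(f"{ha1}:{nonce}:{nc}:{cnonce}:{qop}:{ha2}")
```

Because the Digest response depends on the nonce, a captured response cannot simply be replayed against a fresh challenge, which is one reason Digest is preferred where real authentication is needed.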

19.4 Management Issues

Managing a web server is a complicated task, because of all the different possibilities that exist for setting it up and tuning it. On top of that, the administrator must pay attention to its security and implement the necessary countermeasures so that the server is not an easy target for attacks. Additionally, he must take care to set up correctly the server-side scripting language(s) that the web server will use, in order to avoid setup mistakes that affect a large number of servers, e.g. PHP’s register_globals problem [11]. Therefore, to set up a web server correctly, an administrator must be aware of all the latest vulnerabilities, which can be quite a challenging task, especially as correcting a wrong setup option may sometimes lead to corrupted programs/websites that do not work [12]. The SCWS is not affected by these issues. Most probably it will be set up from a central location (the network operator’s facilities), and it must be carefully managed, because otherwise it may introduce vulnerabilities to attacks against it and against the SIM card itself. Most probably the SCWS will have a common setup for all its instances. An important problem is what to do with older SIM cards that may not be able to host a program like a web server (even a “lightweight” web server like the SCWS). This may lead to a large number of users being unable to have the SCWS installed. Finally, any patches/updates needed for the SCWS will be installed from a central location, which adds another burden to the network provider.


19.5 Personalization

At the beginning, the Internet was a static environment with content that was presented to the user “as is”, without interaction. However, since the arrival of Web 2.0 this has changed radically; now it is often the user who “creates” and personalizes the content [13]. Social networking sites, blogs and other internet sites allow a user to be an author, to present his photographic skills, to communicate with people from all over the world, and in general to create the content. This advance in Internet interactivity allowed a number of companies to approach the user offering services or products of interest according to his Internet “habits”, e.g. a person that visits sports sites would be more interested in sports clothing than a person that visits music sites. The idea of personalized content can be applied to the SCWS environment as well. Network companies may offer services that interest their clients based not only on their needs/interests but also on their situation, e.g. if a client is in a different country, the company may provide information about that place (museums, places of interest, hospitals, other useful local information) and send this information to the SCWS. The user can then easily access the SCWS content using his phone browser, even if the user is offline (not connected to the internet). One issue is to define who will be the creator of the content. Will this be the network provider, or will third parties be allowed to offer content as well? Giving access to a third party may be resisted for business reasons and also because of the potential for undermining the security of the platform. From a practical viewpoint, multiple applications are not too much of a challenge for the modern SIM card platform, as it is designed to permit third parties to install, manage and run applications.

19.6 Web Server Administration

The administration of a web server can be quite a challenging task, due to the fact that the server must run smoothly, work 24/7/365 and serve from a few hundred to even millions of requests (depending on the sites that it hosts). The SCWS will not be installed in a central location like a traditional web server; rather, it will be installed in a (large) number of phones. The web server can come pre-installed on the SIM card, and when an update/patch must take place this can happen in one of the following two ways: either centrally, with mass distribution of newer versions/updates, or by presenting a page to the user for self download and install. Third-party applications that may be installed and run as part of the web server can be administered in the same way.

L. Kyrillidis et al.

19.7 Web Server’s Processing Power and Communication Channel Speed

Depending on the need, a web server can be slow or fast. A server that has to support a hundred requests does not need the same bandwidth or processing power as one that serves a million requests. The supporting infrastructure is of huge importance, so that users’ requests are handled swiftly even in the case of some server failures. Additionally, if the performance of the communications bearer does not match the processing power of the web server itself (and vice versa), the user’s perception of the overall performance will be poor. The SCWS will not serve requests for more than one user (if we assume that the SCWS is only accessible from inside the phone), so one could say that processing power and the communication channel are not of huge importance. However, even if the SCWS has to respond to only one request at a time, this may still be a demanding task while the processing power of the SIM card remains small, especially if the SCWS has to serve multimedia or other resource-consuming content. The same concern applies to the communication channel speed. A traditional web server may have a fast line (or more than one) along with backup lines to serve its requests. The SCWS cannot rely only on the traditional ISO 7816 interface [14]. This interface, which exists in most phones at the moment, is too slow to serve incoming and outgoing requests to and from the SCWS. This need is recognized by ETSI, and the 7816 interface is expected to be replaced in the near future with the (much) faster USB one [15]. The necessary speed can therefore be provided only when the USB interface is widely available.

19.8 IP Mobility

A web server that changes IP addresses may not be as accessible as it must be, because every time its IP address changes a number of DNS servers must be informed and their databases updated. This update may take from several hours to a couple of days. This is the reason that all web servers have static IP addresses [16], so that every time a user enters a URL, there is a known match between that URL and an IP address. At the beginning, the SCWS will not be as accessible as a traditional web server. This is because (as mentioned before) it will serve requests initiated by one user only: it will be accessible only from inside the phone, and its IP address will most probably be 127.0.0.1 (the loopback address). However, if the SCWS becomes accessible to entities outside the phone, it can no longer answer only requests destined for the loopback address and will need a publicly accessible address. Although this can be solved and each SCWS can have a static IP address, what will happen when the user is travelling? After all, IP address ranges are assigned to cities or countries, so a PC has an IP address within one range when it is in London, UK and within another when it is in Thessaloniki, Greece.

Smart Card Web Server

A phone is a mobile device that is often moved between cities, countries or even continents, and so it needs a different IP address every now and then. This means that if the SCWS becomes publicly accessible and serves requests from entities outside the phone, there must be a way to permanently match the SCWS’s URL to a certain IP address. While a fixed address is not possible for a mobile device as such, the answer to this problem can be found in RFC 3344 and RFC 3775 (for Mobile IPv4 and IPv6 respectively). Briefly, the two RFCs use a home address and a care-of address. Packets that are destined for the home address are forwarded to the care-of address (which is the phone’s current address). This binding requires co-operation between the network providers, but if it is set up correctly it can enable the phone to move without problems and permit the SCWS to serve requests smoothly [17, 18].
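The home address and care-of address binding can be illustrated with a minimal sketch. It is not an implementation of RFC 3344 or RFC 3775; the class name `HomeAgent`, its methods and the example addresses (taken from the ranges reserved for documentation) are hypothetical and only show the forwarding idea.

```python
class HomeAgent:
    """Toy model of a Mobile IP home agent (illustrative only)."""

    def __init__(self):
        # Maps a permanent home address to the current care-of address.
        self.bindings = {}

    def register(self, home_addr, care_of_addr):
        """Record where the mobile node can currently be reached."""
        self.bindings[home_addr] = care_of_addr

    def deregister(self, home_addr):
        """Remove the binding, e.g. when the node returns home."""
        self.bindings.pop(home_addr, None)

    def forward(self, dest_addr, payload):
        """Return (next_hop, payload) for a packet sent to dest_addr."""
        # Without a binding the node is assumed to be at home.
        return self.bindings.get(dest_addr, dest_addr), payload


agent = HomeAgent()
scws_home = "192.0.2.10"                    # assumed static home address
agent.register(scws_home, "198.51.100.7")   # phone attached to a visited network
print(agent.forward(scws_home, b"GET /"))   # forwarded to the care-of address
agent.register(scws_home, "203.0.113.42")   # phone has moved again
print(agent.forward(scws_home, b"GET /"))
```

The URL of the SCWS keeps resolving to the home address; only the binding held by the home agent changes when the phone moves.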

19.9 Conclusion

Predicting the future of the SCWS is not an easy task, but it is certainly an interesting one. Obstacles associated with the SIM card’s limited processing power and low-bandwidth communication channel may be overcome by advances in current technology. Newer SIM cards have larger memory capacities and faster processing units, and the USB interface will provide the necessary communication speed. However, technical issues alone are unlikely to decide the future of the SCWS, as this will be determined primarily by the network providers on the basis of profitability, potential security vulnerabilities and user acceptance.

References

1. Rees J, Honeyman P (1999) Webcard: a java card web server. Center for Information Technology Integration, University of Michigan
2. Open Mobile Alliance. http://www.openmobilealliance.org/
3. OMA, Enabler Release Definition for Smartcard-Web-Server Approved Version 1.0, 21 April 2008, OMA-ERELD-Smartcard_Web_Server_V1_0-20080421-A
4. OMA, Smartcard Web Server Enabler Architecture Server Approved Version 1.0, 21 April 2008, OMA-AD-Smartcard_Web_Server_V1_0-20080421-A
5. OMA, Smartcard-Web-Server Approved Version 1.0, 21 April 2008, OMA-TS-Smartcard_Web_Server_V1_0-20080421-A
6. ETSI TS 102 223
7. AS Sertifitseerimiskeskus, The Estonian ID Card and Digital Signature Concept Principles and Solutions, Version 20030307
8. Menezes AJ, van Oorschot PC, Vanstone SA (1996) Handbook of applied cryptography. CRC Press, USA, pp 15–23, 352–359
9. Rankl W, Effing W (2003) Smart card handbook, 3rd edn. Wiley, New York, pp 521–563
10. Internet Engineering Task Force, HTTP Authentication: Basic and Digest Access Authentication. http://tools.ietf.org/html/rfc2617
11. PHP Manual, Using Register Globals. http://php.net/manual/en/security.globals.php
12. Esser S, $GLOBALS Overwrite and its Consequences. http://www.hardened-php.net/globals-problem (November 2005)
13. Anderson P (2007) What is Web 2.0? Ideas, Technologies and Implications for Education. Technology & Standards Watch, February 2007
14. Mayes K, Markantonakis K (2008) Smart cards, tokens, security and applications. Springer, Heidelberg, pp 62–63
15. ETSI SCP Rel.7
16. Hentzen W, DNS explained. Hentzenwerke Publishing Inc, USA, pp 3–5
17. Internet Engineering Task Force, IP Mobility Support for IPv4. http://www.ietf.org/rfc/rfc3344.txt
18. Internet Engineering Task Force, Mobility Support in IPv6. http://www.ietf.org/rfc/rfc3775.txt

Chapter 20

A Scalable Hardware Environment for Embedded Systems Education

Tiago Gonçalves, A. Espírito-Santo, B. J. F. Ribeiro and P. D. Gaspar

Abstract This chapter presents a scalable platform designed from scratch to support teaching laboratories on embedded systems. The platform’s complexity can grow, offering more functionality as the student’s education progresses. An I2C bus guarantees the continuity of functionality among modules. This functionality is supported by a communication protocol that is also presented in this chapter.

20.1 Introduction

Embedded systems design plays a strategic role from an economic point of view, and industry requires adequately trained engineers to perform this task [1, 2]. Universities all over the world are adapting their Electrical Engineering and Computer Science curricula to meet this need [3–7]. An embedded system is a specialized system in which the computer is enclosed inside the device that it controls. Both low- and high-technology products are built following this concept.

T. Gonçalves (&) A. Espírito-Santo B. J. F. Ribeiro P. D. Gaspar Electromechanical Engineering Department, University of Beira Interior, Covilhã, Portugal e-mail: [email protected] A. Espírito-Santo e-mail: [email protected] B. J. F. Ribeiro e-mail: [email protected] P. D. Gaspar e-mail: [email protected]

S. I. Ao and L. Gelman (eds.), Electrical Engineering and Applied Computing, Lecture Notes in Electrical Engineering, 90, DOI: 10.1007/978-94-007-1192-1_20, Springer Science+Business Media B.V. 2011

T. Gonçalves et al.

The complexity of an embedded system changes from one product to another, depending on the task that it must perform. Embedded system designers must therefore have knowledge from different areas. The development of hardware requires knowledge of digital and/or analog electronics and, at the same time, of electromagnetic compatibility, which cannot be neglected in high-frequency operation or in products that must work in very restrictive environments, such as those found in hospitals. The designer must also design the firmware that the hardware requires in order to work as expected. Subjects like operating systems, real-time systems, fixed- and floating-point arithmetic, digital signal processing, and programming languages such as assembly, C/C++ or Java are of major relevance to the development of embedded systems. Curricula for teaching embedded systems are not well established, unlike the classic knowledge areas, where it is possible to find textbooks to support students’ study. Different approaches are used to structure curricula in embedded systems. The ARTIST team has established the competencies required by an embedded software engineer. Their proposal highlights practice as an essential component of education in embedded systems [8]. As stated previously, the skills that an embedded systems designer must hold are highly complex and spread across different areas [9, 10]. If, beyond this knowledge, the student also needs to learn how to work with a complex development kit, then he/she will probably fail. Even if the student is adequately prepared in the essential subjects, the time taken to obtain visible results is sometimes responsible for student demotivation and consequent failure.

20.2 Hardware Platform Design Overview

The development of an embedded system relies on a large and assorted set of technologies in constant and rapid evolution. The learning platform described here aims to improve student access to this set of technologies in an educational environment [11]. For this reason, the use of commercial evaluation kits is discouraged, since they are mainly developed to demonstrate the potential of a specific product, without educational concerns. The MSP430 family was selected for the number of configurations available, with a high count of peripherals and different memory sizes, and for its popularity. Another relevant attribute is the learning curve of these devices which, in the authors’ experience, allows results to be obtained rapidly [12]. The platform currently has four modules of increasing complexity (see Fig. 20.1), and others can be developed in the future. The learning platform can thus satisfy the needs of students from beginners to the experienced and, at the same time, has a high potential for evolution. The platform allows users to experiment with different kinds of interfaces, such as an OLED display, seven-segment displays and conventional LEDs. Students can also interact with several other peripherals, for example pressure switches, a touch-screen pad, a joystick, and an accelerometer. An overview of the structure of the developed modules is given in Fig. 20.2. The modules were built around three devices from the MSP430 family: MSP430F2112, MSP430FG4618 and MSP430F5419.

Fig. 20.1 Example of one allowed configuration: Module 0 and Module 1 connected

The strategy adopted to design the teaching platform took into consideration the following characteristics:

• Modules can exchange data through an I2C bus.
• Each module has two microcontrollers: one manages the communications, while the other is available for the student’s work.
• An SPI bus connects both processors in the same module.
• The energy consumed in each module can be measured and displayed in real time.
• All modules have standard dimensions.

Fig. 20.2 Structure overview of the teaching environment for embedded system (a master and two slaves, Slave 1 and Slave 2; in each unit the module function is linked by SPI to Tx and Rx buffers with an I2C interface, the master additionally holds router buffers, and all units share the I2C bus)

Fig. 20.3 Communication infrastructure (an MSP430F2112 with JTAG for program and debug, power, reset, SPI, UART and I/O connections; a BCD encoder sets the I2C address; the ADC reads current sensors S1 and S2 and auxiliary inputs Aux1 and Aux2 through second-order filters; I2C connects to the Expansion Bus)

The communication management infrastructure shown in Fig. 20.3 implements the protocol described in Sect. 20.4. All the functionality specified for Module 0 (Basic Interface and Power) and for Module 1 (Basic Interface) is implemented with the MSP430F2112 microcontroller. This device can operate at a maximum clock frequency of 16 MHz and has 32 kB of flash memory, 256 B of RAM, one 10-bit ADC, two timers with their respective compare/capture units, and enough digital I/O to satisfy the needs of these modules. The device also supports the SPI, UART, LIN, IrDA, and I2C communication protocols. A BCD encoder sets the address of the module on the I2C bus. Four LEDs show the communication status. A pressure switch can reset the communication management hardware. An SPI channel connects the user’s microcontroller to the microcontroller that manages the communications, which in turn is connected to the I2C bus. The MSP430 family performs well in situations where energy consumption is a major concern. The energy consumed by each module can be monitored and displayed in real time. Current is measured at two distinct points. At measuring point S1, the current of the user’s microcontroller is acquired with a current shunt monitor (INA198). The total current drawn by the whole module is acquired at measuring point S2 with a precision resistor. Two other analogue inputs are also available; the student can use them to acquire the working voltage of the module. The energy consumed is then computed from the current and the voltage.
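The energy computation mentioned above can be sketched as a sum of voltage times current times sampling interval. The function names, the sampling interval and the shunt and gain values below are assumptions made for illustration; the chapter does not give the actual firmware or component values.

```python
def shunt_current_a(v_out, gain, r_shunt_ohm):
    """Current through a shunt, recovered from the amplified sense voltage."""
    return v_out / (gain * r_shunt_ohm)


def energy_joules(currents_a, voltages_v, dt_s):
    """Approximate E = sum(v * i * dt) for equally spaced samples."""
    if len(currents_a) != len(voltages_v):
        raise ValueError("current and voltage series must have equal length")
    return sum(i * v * dt_s for i, v in zip(currents_a, voltages_v))


# Hypothetical readings: amplifier output in volts, sampled every 0.1 s.
sense_v = [0.20, 0.21, 0.19, 0.20]
supply_v = [3.3, 3.3, 3.3, 3.3]
currents = [shunt_current_a(v, gain=100, r_shunt_ohm=1.0) for v in sense_v]
print(energy_joules(currents, supply_v, dt_s=0.1))
```

A shorter sampling interval gives a closer approximation of the integral, at the cost of more ADC conversions.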

20.3 Platform Modules in Detail

20.3.1 Module 0: Basic Interface and Power

This module was developed to support students’ first steps in embedded systems. Usually, this kind of user does not have any experience with the development of embedded systems. The module helps the student become familiar with the microcontroller architecture and, at the same time, with the software development tool. The internal structure of the module is illustrated in Fig. 20.4. Beyond providing power to itself, the module can also power the modules connected to it through the Expansion Bus. Three different options are available to power the system: a battery, an external power source, or the JTAG programmer. The DC–DC converter accepts an input range from 1.8 to 5.5 V and provides a regulated 3.3 V output. Powering from the JTAG is only available to support programming and debugging activities. A numerical display was built with four independent seven-segment digits. Four data lines (DATA) control the writing operation. A BCD decoder allows the desired value to be written to the display. The digit to be written at a specific moment is selected by four control lines (ENDIS). Because the BCD decoder does not latch its output, the microcontroller must continually refresh the value shown on the display, at a minimum frequency of 15 Hz. Simple as this module is, its predefined task is to display the current and the energy consumed by each of the modules connected to the Expansion Bus. Two BCD encoders, with ten positions each, are connected to the microcontroller through eight selection lines (SEL) and are used to select the module from which the information will come. To perform this task, the user’s microcontroller must be loaded with specific firmware.

Fig. 20.4 Module 0: structure of the basic interface and power module (battery, external power and JTAG feed a DC–DC voltage regulator; an MSP430F2112 with JTAG for program and debug, two BCD encoders on the SEL lines, a BCD decoder driven by the DATA lines, digit enables on the ENDIS lines, current sensors S1 and S2, and communication management linked by SPI, with I2C to the Expansion Bus)
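The multiplexed refresh of the four-digit display can be sketched as follows. The one-hot, active-high encoding of the ENDIS lines and the function names are assumptions; only the four DATA lines, the four ENDIS lines and the 15 Hz minimum refresh rate come from the text.

```python
def bcd_digits(value, n_digits=4):
    """Split a non-negative integer into BCD digits, most significant first."""
    if not 0 <= value < 10 ** n_digits:
        raise ValueError("value does not fit in the display")
    return [(value // 10 ** k) % 10 for k in reversed(range(n_digits))]


def refresh_frames(value, n_digits=4):
    """Return one (DATA nibble, ENDIS mask) pair per digit for a full refresh."""
    frames = []
    for pos, digit in enumerate(bcd_digits(value, n_digits)):
        endis = 1 << (n_digits - 1 - pos)   # assumed: one-hot, active-high enable
        frames.append((digit, endis))
    return frames


def digit_slot_s(refresh_hz=15, n_digits=4):
    """Longest time each digit may stay selected to keep the refresh rate."""
    return 1.0 / (refresh_hz * n_digits)


for data, endis in refresh_frames(1234):
    print(f"DATA={data:04b} ENDIS={endis:04b}")
print(digit_slot_s())
```

In firmware, a timer interrupt would typically step through these pairs, writing DATA and ENDIS once per slot.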

20.3.2 Module 1: Basic Interface

This module is aimed at the student who already has some basic knowledge of the embedded systems field. Like the previous module, it is based on the MSP430F2112 microcontroller. As can be seen in Fig. 20.5, eight switches, eight LEDs, and a two-digit seven-segment display are connected to this device.

Fig. 20.5 Module 1: structure of the basic interface module (an MSP430F2112 with JTAG for program and debug, power and UART; DATA lines drive two BCD decoders for the seven-segment digits and a buffer/driver for the eight LEDs; switches Sw1 to Sw8; current sensors S1 and S2; communication management linked by SPI, with I2C to the Expansion Bus)

This module is intended to develop the student’s competences related to synchronous and asynchronous interrupts. The student can also explore programming techniques, for example those based on interrupts or on port polling, to check the status of the digital inputs or to set the status of the digital outputs. With this module, the student has a first contact with the connection of the microprocessor to other devices, which compels him to respect access times. Two BCD decoders, able to latch their outputs, are used to write to the display. While the DATA lines are used to write the value to the display, the LE lines select which digit will be written. The DATA lines are also used, through a buffer/driver, to turn each of the eight LEDs on or off. The user can configure the inactive state of the switches.

20.3.3 Module 2: Analog and Digital Interface

Students with more advanced knowledge of the embedded systems field can use this module to improve their ability to develop applications in which human–machine interface, analogue signal conversion, and digital processing are key aspects. The module was designed around the MSP430FG4618 microcontroller. This device can operate at a maximum clock frequency of 8 MHz and has 116 kB of flash memory and 8 kB of RAM. The high count of digital I/O is shared with the on-chip LCD controller. Other peripherals normally used for analog/digital processing are also present: a 12-bit ADC, two DACs, three operational amplifiers, DMA support, two timers with compare/capture units, and a hardware multiplier. The device also supports the SPI, UART, LIN, IrDA, and I2C communication protocols. The on-chip LCD controller allows the connection of an LCD with 160 segments. On board can be found a navigation joystick with four positions and a switch, a rotational encoder with 24 pulses/turn and a switch, a speaker output, a microphone input, generic I/O, and an alphanumeric LCD. The internal structure of Module 2 (Analog and Digital Interface) is illustrated in Fig. 20.6.

Fig. 20.6 Module 2: structure of the analog and digital interface module (an MSP430FG4618 with JTAG for program and debug, power and UART; an LCD controller with SEG and COM lines; operational amplifiers with microphone input and speaker output; digital I/O port; joystick on the NAV lines and rotational encoder on the ROT lines, each with a switch; current sensors S1 and S2; communication management linked by SPI, with I2C to the Expansion Bus)

Students can explore the operation of LCD devices, taking advantage of the on-chip LCD controller. The module can also be used to improve knowledge related to the development of human–machine interfaces. An example of a laboratory experiment that students can perform with this module is to acquire an analogue signal, condition it with the on-chip op-amps, and digitally process the conversion result with a software application. The result can be converted back to the analog world using the on-chip DAC. With the on-chip op-amps it is possible to verify the operation of different topologies, for example: buffer, comparator, inverting and non-inverting amplifier, and differential amplifier with programmable gain. The digitized signal can be processed using the multiply-and-accumulate hardware peripheral. Students can thus draw conclusions about the relevance of this peripheral in the development of fast real-time applications.
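The role of the multiply-and-accumulate step in such an experiment can be sketched with a small fixed-point FIR filter. This is a host-side model written for illustration, not code for the hardware multiplier; the Q15 coefficient format, the four-tap moving average and the function names are assumptions.

```python
def fir_q15(samples, coeffs_q15):
    """Filter integer samples with Q15 coefficients using multiply-accumulate."""
    taps = len(coeffs_q15)
    history = [0] * taps
    out = []
    for x in samples:
        history = [x] + history[:-1]      # newest sample first
        acc = 0
        for h, c in zip(history, coeffs_q15):
            acc += h * c                  # the multiply-and-accumulate step
        out.append(acc >> 15)             # scale the Q15 product back
    return out


def to_dac_code(y, bits=12):
    """Clamp a result to the unsigned range of a DAC with the given width."""
    return max(0, min((1 << bits) - 1, y))


# Four-tap moving average: each coefficient is 0.25 in Q15 (8192 / 32768).
adc_samples = [2048, 2048, 2048, 2048, 2048]
filtered = fir_q15(adc_samples, [8192] * 4)
print([to_dac_code(y) for y in filtered])   # ramps up to 2048 as the taps fill
```

On the microcontroller, the inner loop is the part a hardware multiplier accelerates, which is why it matters for fast real-time processing.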

20.3.4 Module 3: Communication Interface

This is the most advanced module. It gives the student access to a set of sophisticated devices that are commonly incorporated in embedded systems. The student can explore how to work with an OLED display with 160 × 128 pixels and 262,000 colors, a three-axis accelerometer, an SD card, a touch-screen, a USB port, and two PS/2 interfaces. The module also has connectors to support the radio frequency modules Chipcon-RF and RF-EZ430. Module 3 (Communication Interface), illustrated in Fig. 20.7, was built around a microcontroller with high processing power. The MSP430F5419 can operate at a maximum clock frequency of 18 MHz and has 128 kB of flash memory and 16 kB of RAM. This microcontroller also has a hardware multiplier, a real-time clock, direct memory access, and a 12-bit ADC. The communication protocols SPI, UART, LIN, IrDA, and I2C can be implemented in four independent peripherals. The device has three independent timers with compare/capture units.

Fig. 20.7 Module 3: structure of the communication interface module (an MSP430F5419 with JTAG for program and debug, power, current sensors S1 and S2, and communication management with I2C to the Expansion Bus; attached devices are the RF-EZ430 and Chipcon RF modules, a USB controller with EPROM, an SD connector, an OLED display, two PS/2 connectors behind level shifters, a touch-screen controller with a 4-wire touch-screen, and a 3-axis accelerometer; control lines shown include SUSPUSB, VOLEDEN, VTSCEN, PENIRQ, INTACC and VACCEN)

The MMA7455L accelerometer, with adjustable sensitivity, uses an I2C interface to connect to the microcontroller on port USCIB1. It has two outputs that can signal different conditions, such as data available, free fall, or motion detection. The device can be enabled by the microcontroller through the VACCEN line. The four-wire resistive touch-screen, with a 6 × 8 cm area, needs permanent management of its outputs. To free the microcontroller from this task, a touch-screen controller is used, connected to the microcontroller through the I2C bus (USCIB1). The PENIRQ line notifies the microcontroller that the touch-screen is requesting attention. To save power, the microcontroller can enable or disable the touch-screen controller through the VTSCEN line. The two PS/2 ports are connected to the microcontroller through a bidirectional level shifter that adapts the working voltage levels from 5 to 3.3 V. The clock signal is provided by Timer B. The USB communication port uses a dedicated controller that interfaces with the UART (USCIA2). The USB controller firmware is stored in an EPROM that can communicate with the USB controller or with the microcontroller over an I2C bus (USCIB3). The USB controller can be suspended or reset by the microcontroller through the SUSPUSB line. The SD socket is only a physical support that allows the card to be connected to the microcontroller through an SPI interface (USCIA1). Two additional lines enable card detection and the inhibition of write operations. The OLED display supports two different interface methods: an 8-bit parallel interface, or SPI (USCIB0). The parallel interface requires a specific software driver. A simpler interface can be implemented with the SPI bus. The radio frequency interfaces allow the connection of a Chipcon RF module, connected to the microcontroller by an SPI interface (USCIA3). The RF-EZ430 radio frequency module can be connected through a UART (USCIA0).

20.4 Communications Protocol

Data exchange between modules is based upon two communication methods: (1) the communication method between the module application function and the network function; and (2) the communication method supported by the bus, whose main task is to interconnect all modules, working as a router among them. This characteristic gives the system high versatility and allows the set of supported applications to grow. The adopted solution was a serial bus; the key criteria for this choice were the physical dimensions, the maximum length of the bus, the transmission data rate and, of course, the availability of the serial communication interface. The I2C communication protocol is oriented to master/slave connection, i.e., the exchange of data always occurs through the master. This leads to the definition of two distinct functional units. Although this technology supports multi-master operation, it was decided to use a single master, giving the network a hierarchical structure with two levels. All the possible information transactions between the master and a slave at the physical level of the bus are represented in Fig. 20.8. To send a command, the master begins the communication by sending a start signal (S), followed by the slave address (Address X). The slave returns the result to the master after receiving and executing the command. The procedures to send data to and receive data from the slave are also represented. In both, the master starts the communication by sending the start signal (S), followed by the slave address. The kind of operation to perform (read or write) is sent next. The ACK signal sent by the I2C controllers is not represented in the figure.

Fig. 20.8 Typical master/slave data exchange at the I2C bus level (three transactions are shown: master sends command to slave, master gets data from slave, and master sends data to slave; legend: S, bus start condition; P/S, bus stop or start condition; W, bus write operation; R, bus read operation)

Module 0 (Basic Interface and Power) performs the master role. This choice is justified by the fact that this module is always present in the network. If slave unit B wants to send a message to slave unit A, the message must first be sent to the master, which forwards it to the destination unit. To satisfy the specifications, a communication frame was defined, as can be observed in Fig. 20.9. In order to implement the communication service layer, a media access protocol was outlined with the mechanisms required to exchange information between modules. The service frame and the communication protocols between the master unit and the slaves are defined as follows. The service frame has three fields. The header field, two bytes long, holds the message type and the routing information; the latter includes the sender and receiver addresses and the length of the payload in bytes. The payload field carries the information from one module to another. Finally, the checksum field, one byte long, protects the integrity of the communication. The service layer uses a command set with three main goals: network management and setup; synchronization of information transaction tasks; and measurement. The service layer protocol has three types of frames (data, acknowledgment and MAC command). The data frame transports the information between the applications running in the modules. The command frame establishes the service layer protocol. The message type is specified in the frame type sub-field of the header, as reported in Table 20.1. The command set is listed in Table 20.2. At power-on, the master searches for slaves available to join the network. It does so by sending the command “Get_ID Request” to every available slave address; the slaves that are present acknowledge it by responding with the command “Get_ID Response”.
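The service frame described above can be sketched as a pair of pack and unpack functions. The bit positions inside the two header bytes and the checksum rule (sum of all preceding bytes modulo 256) are assumptions chosen for illustration, since the chapter does not specify them; the field widths and the 8-byte payload limit follow Fig. 20.9.

```python
MAX_PAYLOAD = 8  # payload limit taken from Fig. 20.9


def checksum(data):
    """Assumed checksum: sum of the bytes modulo 256."""
    return sum(data) & 0xFF


def pack_frame(frame_type, pending, src, dst, payload=b""):
    """Build header (2 bytes), payload and checksum (1 byte)."""
    if not (0 <= frame_type < 8 and 0 <= src < 16 and 0 <= dst < 16):
        raise ValueError("field out of range")
    if len(payload) > MAX_PAYLOAD:
        raise ValueError("payload too long")
    # Assumed layout: [type:3 | pending:1 | src:4] [dst:4 | length:4]
    byte0 = (frame_type << 5) | (int(bool(pending)) << 4) | src
    byte1 = (dst << 4) | len(payload)
    body = bytes([byte0, byte1]) + bytes(payload)
    return body + bytes([checksum(body)])


def unpack_frame(frame):
    """Validate a frame and return its fields as a dictionary."""
    if len(frame) < 3 or checksum(frame[:-1]) != frame[-1]:
        raise ValueError("bad length or checksum")
    byte0, byte1 = frame[0], frame[1]
    if len(frame) != 3 + (byte1 & 0x0F):
        raise ValueError("length field mismatch")
    return {
        "frame_type": byte0 >> 5,
        "pending": bool(byte0 & 0x10),
        "src": byte0 & 0x0F,
        "dst": byte1 >> 4,
        "payload": bytes(frame[2:-1]),
    }


frame = pack_frame(0b001, False, src=3, dst=1, payload=b"\x10\x20")
print(frame.hex(), unpack_frame(frame))
```

A receiver that finds a checksum or length mismatch would discard the frame; how the real protocol reacts to such errors is not described in the chapter.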

Fig. 20.9 General frame format (the frame consists of a header, a payload of at most 8 bytes and a 1-byte checksum; the header holds a 4-bit message control field, made of a 3-bit frame type and a 1-bit data pending flag, followed by a 4-bit source address, a 4-bit destination address and a 4-bit data length; the payload carries either a command with its command data or higher-layer data)

Table 20.1 Values of the frame type subfield

Frame type value (b2 b1 b0)    Description
000                            Reserved
001                            Data
010                            Acknowledgment
011                            MAC command
100–111                        Reserved

Table 20.2 Command frame

Command frame identifier    Command name              Direction
0x00                        Heart_Beat Request        M → S
0x01                        Heart_Beat Response       S → M
0x02                        Get_ID Request            M → S
0x03                        Get_ID Response           S → M
0x04                        Software Reset Request    M → S
0x05                        Get_Status Request        M → S
0x06                        Req_Data Request          M → S
0x07                        Req_Data Response         S → M
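The identifiers of Tables 20.1 and 20.2 and the power-on search for slaves can be sketched as follows. The enumeration values come from the tables; the `discover` function, the `fake_bus` stand-in for the I2C transfer and the scanned address range are hypothetical.

```python
from enum import IntEnum


class FrameType(IntEnum):
    DATA = 0b001
    ACKNOWLEDGMENT = 0b010
    MAC_COMMAND = 0b011


class Command(IntEnum):
    HEART_BEAT_REQUEST = 0x00
    HEART_BEAT_RESPONSE = 0x01
    GET_ID_REQUEST = 0x02
    GET_ID_RESPONSE = 0x03
    SOFTWARE_RESET_REQUEST = 0x04
    GET_STATUS_REQUEST = 0x05
    REQ_DATA_REQUEST = 0x06
    REQ_DATA_RESPONSE = 0x07


def discover(transfer, addresses):
    """Return the addresses that answer Get_ID Request with Get_ID Response."""
    found = []
    for addr in addresses:
        reply = transfer(addr, Command.GET_ID_REQUEST)
        if reply == Command.GET_ID_RESPONSE:
            found.append(addr)
    return found


def fake_bus(present):
    """Stand-in for the I2C transfer: only the listed slaves reply."""
    def transfer(addr, command):
        if addr in present and command == Command.GET_ID_REQUEST:
            return Command.GET_ID_RESPONSE
        return None   # no slave at this address
    return transfer


# Assumed scan range: addresses 1 to 9, with slaves present at 1 and 3.
print(discover(fake_bus({1, 3}), range(1, 10)))
```

The master would keep the resulting list as the set of units to visit during the periodic polling.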

After the successful setup of the network, the master periodically executes the data polling operation. The task checks all slaves for pending messages. This operation allows data exchange among all the units connected to the I2C bus. The memory available in the master to support the communication task is limited; furthermore, the data polling operation must ensure low bus access times for all units connected to it. The polling operation is illustrated in Fig. 20.10, where the data flow through the bus is represented. The data polling operation is carried out in three phases. (i) The slaves are scanned sequentially by the master, which searches for messages ready to be transferred. (ii) During the routing phase, the master inspects the address field of each message in the input buffer: if the message has a slave as its destination, it is transferred to the output buffer; if it has the master as its destination, it is sent to the user’s microcontroller. (iii) The data polling operation finishes with the master sending all messages in the output buffer to the respective slaves.

Fig. 20.10 Polling data operation (example with master M and Slave 1 to Slave 3 over one polling period: output buffers at the beginning of the period, master buffers at the middle point, and input buffers at the end of the period; the messages shown are S1 to S2, S1 to S3, S2 to M, S3 to S1 and S3 to S2, first collected from the slaves and then sent to the slaves)
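The three phases of the data polling operation can be sketched as a single function. The message representation (source, destination, payload), the master address 0 and the function name are assumptions; buffer size limits, which the text mentions as a constraint, are not modelled.

```python
from collections import deque

MASTER = 0  # assumed address of the master


def polling_cycle(slave_outboxes, master_address=MASTER):
    """Run one polling period.

    slave_outboxes maps a slave address to a list of (src, dst, payload)
    messages waiting in that slave. Returns (delivered, for_master), where
    delivered maps each slave address to the messages it received.
    """
    # Phase (i): scan the slaves in order and collect pending messages.
    input_buffer = deque()
    for addr in sorted(slave_outboxes):
        while slave_outboxes[addr]:
            input_buffer.append(slave_outboxes[addr].pop(0))

    # Phase (ii): route each message by its destination address.
    output_buffer, for_master = deque(), []
    while input_buffer:
        msg = input_buffer.popleft()
        if msg[1] == master_address:
            for_master.append(msg)      # handed to the user's microcontroller
        else:
            output_buffer.append(msg)

    # Phase (iii): send the queued messages to their destination slaves.
    delivered = {addr: [] for addr in slave_outboxes}
    while output_buffer:
        msg = output_buffer.popleft()
        delivered.setdefault(msg[1], []).append(msg)
    return delivered, for_master


# The message set of Fig. 20.10: S1->S2, S1->S3, S2->M, S3->S1, S3->S2.
outboxes = {
    1: [(1, 2, b"a"), (1, 3, b"b")],
    2: [(2, MASTER, b"c")],
    3: [(3, 1, b"d"), (3, 2, b"e")],
}
print(polling_cycle(outboxes))
```

Because every message passes through the master, a slave-to-slave message is delivered at the end of the polling period in which it was collected.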

20.5 Conclusion

This chapter presents the development of a platform to support the teaching of embedded systems. The modules presented here allow the implementation of different laboratory experiments of increasing difficulty. At the same time, the student has access to key technologies related to the development of embedded systems. The presence of an expansion bus gives the learning platform high versatility, because it allows the development of new modules. This versatility is further enhanced by the existence of a communication bus that makes data exchange between modules possible.

References

1. Choi SH, Poon CH (2008) An RFID-based anti-counterfeiting system. IAENG Int J Comput Sci 35:1
2. Lin G-L, Cheng C-C (2008) An artificial compound eye tracking pan-tilt motion. IAENG Int J Comput Sci 35:2
3. Rover DT et al (2008) Reflections on teaching and learning in an advanced undergraduate course in embedded systems. IEEE Trans Educ 51(3):400
4. Ricks KG, Jackson DJ, Stapleton WA (2008) An embedded systems curriculum based on the IEEE/ACM model curriculum. IEEE Trans Educ 51(2):262–270
5. Nooshabadi S, Garside J (2006) Modernization of teaching in embedded systems design: an international collaborative project. IEEE Trans Educ 49(2):254–262
6. Ferens K, Friesen M, Ingram S (2007) Impact assessment of a microprocessor animation on student learning and motivation in computer engineering. IEEE Trans Educ 50(2):118–128
7. Hercog D et al (2007) A DSP-based remote control laboratory. IEEE Trans Ind Electron 54(6):3057–3068
8. Caspi P et al (2005) Guidelines for a graduate curriculum on embedded software and systems. ACM Trans Embed Comput Syst 4(3):587–611
9. Chen C-Y et al (2009) EcoSpire: an application development kit for an ultra-compact wireless sensing system. IEEE Embed Syst Lett 1(3):65–68
10. Dinis P, Espírito-Santo A, Ribeiro B, Santo H (2009) MSP430 teaching ROM. Texas Instruments, Dallas
11. Gonçalves T, Espírito-Santo A, Ribeiro BJF, Gaspar PD (2010) Design of a learning environment for embedded system. In: Proceedings of the world congress on engineering 2010, WCE 2010, 30 June–2 July, 2010, London, UK, pp 172–177
12. MSP430™ 16-bit Ultra-Low Power MCUs, Texas Instruments. http://www.ti.com

Chapter 21

Yield Enhancement with a Novel Method in Design of Application-Specific Networks on Chips Atena Roshan Fekr, Majid Janidarmian, Vahhab Samadi Bokharaei and Ahmad Khademzadeh

Abstract Network on Chip (NoC) has been proposed as a new paradigm for designing Systems on Chip (SoC) that supports a high degree of scalability and reusability. One of the most important issues in NoC design is how to map an application onto an NoC-based architecture so that the performance and cost requirements are satisfied. In this paper a novel procedure is introduced to find an optimal application-specific NoC using Particle Swarm Optimization (PSO) and a linear function that considers communication cost, robustness index and contention factor. Communication cost is a common metric for evaluating mapping algorithms and has a direct impact on the power consumption and performance of the mapped NoC. Robustness index is used as a criterion for estimating the fault-tolerant properties of NoCs, and contention factor strongly affects latency, throughput and communication energy consumption. The experimental results on two real core graphs, VOPD and MPEG-4, show the ability of the proposed procedure to explore the design space and how effectively a designer can customize and prioritize the impact of the metrics.

A. R. Fekr (&) M. Janidarmian CE Department, Science and Research Branch, Islamic Azad University, Tehran, Iran e-mail: [email protected] M. Janidarmian e-mail: [email protected] V. S. Bokharaei ECE Department, Shahid Beheshti University, Tehran, Iran e-mail: [email protected] A. Khademzadeh Iran Telecommunication Research Center, Tehran, Iran e-mail: [email protected]

S. I. Ao and L. Gelman (eds.), Electrical Engineering and Applied Computing, Lecture Notes in Electrical Engineering, 90, DOI: 10.1007/978-94-007-1192-1_21, © Springer Science+Business Media B.V. 2011


21.1 Introduction Owing to the ever-increasing complexity of system on chip (SoC) design and the inefficiency of electrical buses for exchanging data between IP cores at giga scale, the Network on Chip (NoC) has been introduced as a more flexible, scalable and reliable infrastructure. Different mapping algorithms for NoCs have been presented to decide which core should be linked to which router. Mapping an application onto the on-chip network is the first and most important step in the design flow, as it dominates the overall performance and cost [1]. The main purpose of this study is to present a new method that generates a wide range of mappings covering all reasonable values of communication cost. The most appropriate mapping is then selected by a total cost function, which is a linear function. The function can be customized by the designer to weight the impact of three key parameters: communication cost, robustness index and contention factor. The proposed procedure is shown in Fig. 21.1 and explained in the next sections. Although the proposed approach is topology-independent, it is illustrated and evaluated for the 2D mesh topology, as this topology is widely used by mapping algorithms.

21.2 Particle Swarm Optimization as a Mapping Generator Many mapping algorithms have recently been proposed to improve several parameters used in NoC design. One of the most important parameters is the communication cost, and several available mapping algorithms aim to minimize it. Using small hop counts between related cores significantly lowers the communication cost. Moreover, small hop counts reduce the energy consumption and improve other performance metrics such as latency [2]. However, reducing hop counts can decrease the fault-tolerant properties of the NoC. Therefore, the optimal solution minimizes the communication cost while maximizing the fault-tolerant properties of the NoC. In this paper, the particle swarm optimization (PSO) algorithm is used to search for this optimal solution. As a population-based swarm intelligence technique, PSO simulates animal social behaviors such as bird flocking and fish schooling. Owing to its simple concept and ease of implementation, it has gained much attention and many improvements have been proposed [3]. In a PSO system, multiple candidate solutions coexist and collaborate simultaneously. Each solution, called a “particle”, flies in the problem space according to its own “experience” as well as the experience of neighboring particles. Unlike other evolutionary computation algorithms, in PSO each particle uses two pieces of information, velocity and position, to search the problem space (Fig. 21.1).


Fig. 21.1 The proposed procedure to achieve the optimal application-specific Network-on-Chip

The velocity information predicts the next moving direction, while the position vector is used to detect the optimum area. In standard particle swarm optimization, the velocity vector is updated as follows:


Fig. 21.2 Particle swarm optimization algorithm

$$v_{jk}(t+1) = w_t\, v_{jk}(t) + c_1 r_1 \left(p_{jk}(t) - x_{jk}(t)\right) + c_2 r_2 \left(p_{gk}(t) - x_{jk}(t)\right), \qquad w_{t+1} = w_t\, w_{damp} \qquad (21.1)$$

where $v_{jk}(t)$ and $x_{jk}(t)$ represent the $k$th coordinates of the velocity and position vectors of particle $j$ at time $t$, respectively. $p_{jk}(t)$ is the $k$th coordinate of the best position that particle $j$ has found, and $p_{gk}(t)$ denotes the corresponding coordinate of the best position found by the whole swarm. The inertia weight $w_t$, the cognitive coefficient $c_1$ and the social coefficient $c_2$ are three parameters controlling the size of the velocity vector. $r_1$ and $r_2$ are two random numbers generated with normal distributions within the interval [0,1]. With the corresponding velocity information, each particle flies according to the following rule (Eq. 21.2) [3]. This concept is shown in Fig. 21.2:

$$x_{jk}(t+1) = x_{jk}(t) + v_{jk}(t+1) \qquad (21.2)$$

It is worth mentioning that Onyx is one of the best mapping algorithms in terms of communication cost. Using the Onyx result and exploiting the evolutionary nature of PSO, different mappings are created with a variety of communication costs. To do this, the Onyx result is injected into the population initialization step as a particle, as shown in Fig. 21.1b. To avoid rapid convergence, no velocity threshold is defined, and $c_1$, $c_2$, $w_0$ and $w_{damp}$ are set to 3.49, 7.49, 1 and 0.99, respectively, in the proposed PSO algorithm. These values were obtained by examining several simulations, because they drastically affect the diversity of the results.
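The update rules of Eqs. 21.1 and 21.2 can be sketched in a few lines. The following is only a minimal illustration, not the authors' implementation: the function name `pso_step` and its argument layout are hypothetical, positions are treated as plain real-valued vectors (the text does not say how a position is decoded into a core-to-tile mapping), and the random factors are drawn uniformly from [0, 1), which is the usual PSO convention and an assumption here. The default coefficients are the values quoted above.

```python
import random


def pso_step(positions, velocities, personal_best, global_best, w,
             c1=3.49, c2=7.49, w_damp=0.99, rng=random):
    """Apply one velocity/position update (Eqs. 21.1 and 21.2) in place.

    positions, velocities, personal_best: one list of floats per particle.
    global_best: list of floats (best position found by the swarm).
    Returns the damped inertia weight to use in the next iteration.
    Updating the personal and global bests is left to the caller.
    """
    for j, (x, v) in enumerate(zip(positions, velocities)):
        for k in range(len(x)):
            # Assumption: uniform random factors in [0, 1).
            r1, r2 = rng.random(), rng.random()
            v[k] = (w * v[k]
                    + c1 * r1 * (personal_best[j][k] - x[k])
                    + c2 * r2 * (global_best[k] - x[k]))
            # No velocity threshold is applied, as described in the text.
            x[k] += v[k]
    return w * w_damp
```

Because no velocity clamp is applied and the coefficients are large, particles in such a sketch tend to overshoot, which is in line with the stated goal of avoiding rapid convergence and keeping the generated mappings diverse.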

21.3 Experimental Results of Mapping Generator The real core graphs VOPD and MPEG-4 [2] are used with the proposed PSO algorithm. The algorithm was run with an initial population of 1,000 particles for 200 iterations. Figure 21.3a shows the minimum, mean and maximum fitness function values in each iteration. Figure 21.3b shows that our PSO


Fig. 21.3 a Minimum, mean and maximum fitness function values for VOPD and MPEG-4 core graphs, b ability of the proposed mapping generator in producing mappings with all reasonable communication cost values

algorithm generates different mappings of the VOPD and MPEG-4 core graphs with all reasonable communication cost values, because of the convergence control described above. There are 119,912 and 156,055 unique mappings for the VOPD and MPEG-4 core graphs, respectively. It is worth noting that this method also enables the designer to consider other important key parameters.

21.4 Robustness Index Robustness index is considered as a criterion for estimating the fault-tolerant properties of NoCs [4]. The greater the robustness index, the more fault tolerant the NoC design. The robustness index, RI, is based on an extension of the concept of path diversity [5]. For a given communication $c_k \in C$, an NoC architecture graph $A(T, L)$, a mapping function $M$, and a routing function $R$, [4] defines the robustness index for communication $c_k$, $RI(c_k)$, as the average number of routing paths available for communication $c_k$ if a link belonging to the set of links used by $c_k$ is faulty. Formally,

$$RI(c_k) = \frac{1}{|L(c_k)|} \sum_{l_{i,j} \in L(c_k)} \left| \rho(c_k) \setminus \rho(c_k, l_{i,j}) \right| \qquad (21.3)$$


where $\rho(c_k)$ is the set of paths provided by $R$ for communication $c_k$, $\rho(c_k, l_{i,j})$ is the set of paths provided by $R$ for communication $c_k$ that use link $l_{i,j}$, and $L(c_k)$ is the set of links belonging to paths in $\rho(c_k)$. Suppose that there are two routing functions, A and B, where routing function A selects path1 and path2 and routing function B selects path2 and path3 to route packets between source and destination, as shown in Fig. 21.1c. Routing function A selects two disjoint paths, so a faulty link in one path does not compromise communication from source to destination, since the other path is fault-free. However, when routing function B is used, as shown in Fig. 21.1c, the alternative paths share link $l_4$, and any fault in $l_4$ makes communication from “source” to “destination” impossible. Consequently, the NoC that uses routing function A, $NOC_1$, is more robust than the NoC that uses routing function B, called $NOC_2$. This situation is reflected by the robustness index. The robustness indices for the two cases are:

$$RI^{(NOC_1)}(source \rightarrow destination) = \frac{1+1+1+1+1+1}{6} = 1,$$

$$RI^{(NOC_2)}(source \rightarrow destination) = \frac{0+1+1+1+1}{5} = 0.8.$$

$NOC_1$, using path1 and path2, is more robust than $NOC_2$, using path2 and path3, for communication from “source” to “destination”, as $RI^{(NOC_1)} > RI^{(NOC_2)}$. The global robustness index, which characterizes the network, is calculated as the weighted sum of the robustness indices of all communications. For a communication $c_k$, the weight of $RI(c_k)$ is the degree of adaptivity [6] of $c_k$. The degree of adaptivity of a communication $c_k$ is the ratio of the number of allowed minimal paths to the total number of possible minimal paths between the source node and the destination node associated with $c_k$. The global robustness index is defined by Eq. 21.4.

$$RI^{(NOC)} = \sum_{c_k \in C} \alpha(c_k)\, RI^{(NOC)}(c_k) \qquad (21.4)$$

where $\alpha(c_k)$ indicates the degree of adaptivity of communication $c_k$. In this paper, one of the best routing algorithms customized for application-specific NoCs is used. The algorithm, presented in [7], is a highly adaptive deadlock-free routing algorithm. It uses the concept of Application-Specific Channel Dependency Graphs (ASCDG) to guarantee freedom from deadlock [8]. Removing cycles in the ASCDG has a great impact on parameters such as the robustness index and can be done by different methods. Therefore, in this paper, this step is skipped and left to the designer, who can use a preferred method.
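As a rough check of Eqs. 21.3 and 21.4, the two-path example above can be reproduced with a short sketch. It assumes that each routing path is represented as a set of link identifiers; the function names and all link names other than the shared link l4 are hypothetical, since Fig. 21.1c is not reproduced here.

```python
def robustness_index(paths):
    """Eq. 21.3: average number of paths that survive a single link fault.

    paths: list of sets, each set holding the links of one routing path
    offered by the routing function for a communication.
    """
    links = set().union(*paths)  # L(c_k): every link used by any path
    if not links:
        raise ValueError("a communication needs at least one link")
    surviving = [sum(1 for p in paths if link not in p) for link in links]
    return sum(surviving) / len(links)


def global_robustness_index(communications):
    """Eq. 21.4: sum of RI(c_k) weighted by the degree of adaptivity.

    communications: iterable of (alpha, paths) pairs.
    """
    return sum(alpha * robustness_index(paths)
               for alpha, paths in communications)


# Hypothetical link sets: path1 and path2 are disjoint, while
# path2 and path3 share link l4.
path1 = {"l1", "l2", "l3"}
path2 = {"l4", "l5", "l6"}
path3 = {"l4", "l7", "l8"}
ri_noc1 = robustness_index([path1, path2])  # routing function A
ri_noc2 = robustness_index([path2, path3])  # routing function B
```

With these link sets, routing function A yields six links that each leave one path intact, and routing function B yields five links of which only l4 leaves no path, matching the values 1 and 0.8 given above.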


21.5 Contention Factor In [9], an integer linear programming formulation of the contention-aware application mapping problem was presented, which aims at minimizing the inter-tile network contention. That work focuses on the network contention problem, which strongly affects latency, throughput and communication energy consumption. Source-based contention occurs when two traffic flows originating from the same source contend for the same links. Destination-based contention occurs when two traffic flows that have the same destination contend for the same links. Finally, path-based contention occurs when two data flows that neither come from the same source nor go towards the same destination contend for the same links somewhere in the network. The impact of these three types of contention was evaluated, and path-based contention was observed to have the most significant impact on packet latency. Figure 21.1d shows path-based contention. Therefore, in this paper we consider this type of contention as a factor of the mappings. More formally:

$$\text{Contention Factor} = \sum_{\forall e_{i,j},\, e_{k,l} \in E} \left| L\left(r_{map(m_i),map(m_j)}\right) \cap L\left(r_{map(m_k),map(m_l)}\right) \right| \quad \text{for } i \ne k \text{ and } j \ne l \qquad (21.5)$$

Given the communication cost, robustness index and contention factor of each unique mapping, the best application-specific Network on Chip configuration can be chosen according to the designer’s decisions.
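Equation 21.5 can be illustrated with a small sketch. It assumes that the route of every flow has already been computed and is given as a set of link identifiers, and that each pair of flows is counted once; both points, as well as the function name and the example data, are assumptions made for illustration only.

```python
from itertools import combinations


def contention_factor(routes):
    """Path-based contention in the sense of Eq. 21.5.

    routes: dict mapping a flow (i, j), i.e. source core i and
    destination core j, to the set of links its route occupies.
    Pairs of flows sharing a source or a destination are skipped,
    since they fall under source- or destination-based contention.
    """
    total = 0
    for ((i, j), links_a), ((k, l), links_b) in combinations(routes.items(), 2):
        if i != k and j != l:
            total += len(links_a & links_b)  # links both routes contend for
    return total


# Hypothetical routes: only flows (0, 1) and (2, 3) differ in both
# source and destination, and they share the single link "b".
example_routes = {
    (0, 1): {"a", "b"},
    (2, 3): {"b", "c"},
    (0, 3): {"b"},
}
example_cf = contention_factor(example_routes)
```

If ordered pairs were intended in Eq. 21.5, every shared link would be counted twice; this scales the factor uniformly and would not change the ranking of mappings.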

21.6 Optimal Solution Using a Linear Function As previously mentioned, a lower communication cost leads to an NoC with better metrics such as energy consumption and latency. The other metrics introduced were the robustness index, which is used as a measurable criterion for fault-tolerant properties, and the contention factor, which has a significant impact on packet latency. A total cost function is introduced to minimize the weighted sum of these metrics, with the robustness index entering negatively because it is to be maximized (Fig. 21.1e). The total cost function is defined as follows:

$$\text{Total Cost Function} = \min\left\{ \frac{\delta_1}{\alpha}\, commcost_i + \frac{\delta_2}{\beta}\left(-RI_i^{(NOC)}\right) + \frac{\delta_3}{\gamma}\, CF_i \right\} \quad \forall\, mapping_i \in \text{generated mappings}, \quad \delta_1 + \delta_2 + \delta_3 = 1 \qquad (21.6)$$


where $commcost_i$ is the communication cost, $RI_i^{(NOC)}$ is the robustness index and $CF_i$ is the contention factor of the NoC after applying $mapping_i$. The constants $\alpha$, $\beta$ and $\gamma$ are used to normalize the communication cost, the robustness index and the contention factor. In this paper, $\alpha$, $\beta$ and $\gamma$ are set to the maximum values obtained for communication cost, robustness index and contention factor. $\delta_1$, $\delta_2$ and $\delta_3$ are the weighting coefficients meant to balance the metrics. Although multi-objective evolutionary algorithms can be used to solve this problem, the proposed procedure is considered advantageous in this study for the following reasons. First, a designer can change the weighting coefficients without rerunning the algorithm. Second, owing to the convergence control, the results are more diversified than those of multi-objective evolutionary algorithms, and the diversity can be increased further by enlarging the population size and/or the number of iterations. Finally, if the designer focuses on communication cost, evolutionary algorithms usually do not reach the optimal communication cost.
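The selection step can be sketched as follows, assuming that the three metrics of every generated mapping are already available. The function name, the dictionary keys and the sample numbers are hypothetical and serve only to show how the weights steer the choice; the robustness index is subtracted because a larger value is better, and the normalizers are assumed to be non-zero.

```python
def select_mapping(mappings, d1, d2, d3):
    """Return the mapping that minimizes the weighted cost of Eq. 21.6.

    mappings: list of dicts with keys "commcost", "ri" and "cf".
    d1, d2, d3: weighting coefficients, expected to sum to 1.
    """
    if abs(d1 + d2 + d3 - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    # Normalizers: maximum value observed for each metric
    # (assumed non-zero in this sketch).
    alpha = max(m["commcost"] for m in mappings)
    beta = max(m["ri"] for m in mappings)
    gamma = max(m["cf"] for m in mappings)

    def cost(m):
        return (d1 / alpha * m["commcost"]
                - d2 / beta * m["ri"]
                + d3 / gamma * m["cf"])

    return min(mappings, key=cost)


# Made-up candidates, not taken from the experiments in this chapter.
candidates = [
    {"name": "A", "commcost": 4000, "ri": 50, "cf": 300},
    {"name": "B", "commcost": 5000, "ri": 60, "cf": 100},
]
cheapest = select_mapping(candidates, 1.0, 0.0, 0.0)
most_robust = select_mapping(candidates, 0.0, 1.0, 0.0)
```

Since the metrics are computed once per mapping, changing the weights only requires re-evaluating this function, which reflects the first advantage listed above.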

21.7 Final Experimental Results To better investigate the capabilities of the proposed procedure shown in Fig. 21.1, we carried out experiments on the real core graphs VOPD and MPEG-4. As mentioned before, one advantage of the proposed mapping generator is the diversity of the solutions it produces. In the experiments, the mapping generator produced 201,000 mappings for each of VOPD and MPEG-4, a number bounded by the population size and the maximum number of iterations of the PSO algorithm (1,000 particles over the initial population and 200 iterations). Discarding duplicate mappings left 119,912 and 156,055 unique mappings for VOPD and MPEG-4, respectively. The results of running this procedure for the VOPD and MPEG-4 core graphs and evaluating the values in the 3D design space are shown in Figs. 21.4, 21.5, 21.6, 21.7, 21.8, 21.9, 21.10, and 21.11. The values of $\delta_1$, $\delta_2$ and $\delta_3$ used in these experiments are 0.5, 0.3 and 0.2 for the VOPD core graph and 0.1, 0.2 and 0.7 for the MPEG-4 core graph, respectively. As can be seen in these figures, many different mappings have the same communication cost, which is one of the strengths of the proposed mapping generator. On average, there are almost 18 and 12 different mappings for each communication cost value for the VOPD and MPEG-4 core graphs, respectively. The optimal application-specific NoC configuration can be selected by setting proper values in the total cost function according to the designer’s demands. In our design, the VOPD mapping with communication cost 4347, robustness index 54.28 and contention factor 284 is the optimal solution. For MPEG-4, the mapping with communication cost 6670.5, robustness index 35.94 and contention factor 6 is the optimal solution.


Fig. 21.4 Robustness index, contention factor and communication cost of VOPD mappings in 3D design space

Fig. 21.5 Communication cost, robustness index and total cost of VOPD mappings in 3D design space

Fig. 21.6 Communication cost, contention factor and total cost of VOPD mappings in 3D design space

Fig. 21.7 Robustness index, contention factor and total cost of VOPD mappings in 3D design space


Fig. 21.8 Robustness index, contention factor and communication cost of MPEG-4 mappings in 3D design space

Fig. 21.9 Communication cost, robustness index and total cost of MPEG-4 mappings in 3D design space

Fig. 21.10 Communication cost, contention factor and total cost of MPEG-4 mappings in 3D design space

Fig. 21.11 Robustness index, contention factor and total cost of MPEG-4 mappings in 3D design space


21.8 Conclusion As mapping is the most important step in Network-on-Chip design, this paper presented a new mapping generator based on the Particle Swarm Optimization algorithm. The best mapping in terms of communication cost was derived from the Onyx mapping algorithm and injected into the population initialization step as a particle. Because the Onyx mapping result is used as a particle, convergence of the results was controlled by finding appropriate values for the parameters of the velocity vector. This PSO algorithm is able to generate different mappings with all reasonable communication cost values. Using three metrics for each unique mapping (communication cost, robustness index and contention factor), the best application-specific Network-on-Chip configuration can be selected according to the designer’s demands, which are applied through the total cost function. Acknowledgments This chapter is an extended version of the paper [10] published in the proceedings of The World Congress on Engineering 2010, WCE 2010, London, UK.

References
1. Shen W, Chao C, Lien Y, Wu A (2007) A new binomial mapping and optimization algorithm for reduced-complexity mesh-based on-chip network. In: Networks-on-chip, NOCS, 7–9 May 2007, pp 317–322
2. Janidarmian M, Khademzadeh A, Tavanpour M (2009) Onyx: a new heuristic bandwidth-constrained mapping of cores onto tile based Network on Chip. IEICE Electron Express 6(1):1–72
3. Cui Z, Cai X, Zeng J (2009) Chaotic performance-dependant particle swarm optimization. Int J Innov Comput Inf Control 5(4):951–960
4. Tornero R, Sterrantino V, Palesi M, Orduna JM (2009) A multi-objective strategy for concurrent mapping and routing in networks on chip. In: Proceedings of the 2009 IEEE international symposium on parallel & distributed processing, pp 1–8
5. Dally WJ, Towles B (2004) Principles and practices of interconnection networks. Morgan Kaufmann, San Francisco
6. Glass CJ, Ni LM (1994) The turn model for adaptive routing. J Assoc Comput Mach 41(5):874–902
7. Palesi M, Longo G, Signorino S, Holsmark R, Kumar S, Catania V (2008) Design of bandwidth aware and congestion avoiding efficient routing algorithms for networks-on-chip platforms. In: Second ACM/IEEE international symposium on networks-on-chip, NoCS 2008, pp 97–106
8. Palesi M, Holsmark R, Kumar S (2006) A methodology for design of application specific deadlock-free routing algorithms for NoC systems. In: Proceedings of the 4th international conference on hardware/software codesign and system synthesis, CODES+ISSS '06, pp 142–147
9. Chou C, Marculescu R (2009) Contention-aware application mapping for Network-on-Chip communication architectures. In: IEEE international conference on computer design, ICCD 2008, vol 19, pp 164–169
10. Roshan Fekr A, Khademzadeh A, Janidarmian M, Samadi Bokharaei V (2010) Bandwidth/fault tolerance/contention aware application-specific NoC using PSO as a mapping generator. In: Lecture notes in engineering and computer science: proceedings of the world congress on engineering 2010, WCE 2010, 30 June–2 July 2010, London, UK, pp 247–252

Chapter 22

On-Line Image Search Application Using Fast and Robust Color Indexing and Multi-Thread Processing Wichian Premchaisawadi and Anucha Tungkatsathan

Abstract Keyword-based image search engines such as Google or Yahoo may return a large number of junk images that are irrelevant to the given keyword-based queries. In this paper, an interactive approach is developed to filter out the junk images from the keyword-based Yahoo image search results obtained through the Yahoo BOSS API. A framework of multi-threaded processing is proposed to incorporate an image analysis algorithm into text-based image search engines. It enhances the capability of an application when downloading images, indexing them, and comparing the similarity of images retrieved from diverse sources. We also propose an efficient color descriptor technique for image feature extraction, namely Auto Color Correlogram and Correlation (ACCC), to improve the efficiency of the image retrieval system and reduce the processing time. The experimental evaluations based on the coverage ratio measure show that our scheme significantly improves the retrieval performance over existing image search engines.

22.1 Introduction Most of the popular commercial search engines, such as Google, Yahoo, and even the more recent Bing, introduced by Microsoft, have achieved great success in exploiting pure keyword features for retrieval from large-

W. Premchaisawadi (&) A. Tungkatsathan Graduate School of Information Technology in Business, Siam University, 38 Petkasem Rd., Phasi-Charoen, Bangkok, Thailand e-mail: [email protected] A. Tungkatsathan e-mail: [email protected]

S. I. Ao and L. Gelman (eds.), Electrical Engineering and Applied Computing, Lecture Notes in Electrical Engineering, 90, DOI: 10.1007/978-94-007-1192-1_22, © Springer Science+Business Media B.V. 2011


scale online image collections. Unfortunately, these image search engines are still unsatisfactory because of their relatively low precision and the large number of junk images they return [1]. One of the main reasons is that these engines do not use the visual signature of an image for indexing and retrieval. Image indexing, retrieval, and similarity measurement among images generally take a long computation time, which makes them unsuitable for real-time processing. Many researchers have tried to reduce the computation time by applying distributed computing, for instance cluster computing [2–7]. Lu et al. presented a parallel technique, developed on a cluster architecture, for feature extraction and similarity comparison of visual features. Their experiments show that parallel computing can significantly improve the performance of a retrieval system [2]. Kao et al. proposed a cluster platform that supports the implementation of retrieval approaches used in content-based image retrieval (CBIR) systems. Their paper introduces the basic principles of image retrieval with dynamic feature extraction on a cluster platform architecture; the main focus is workload balancing across the cluster with a scheduling heuristic and execution performance measurements of the implemented prototype [3]. Ling and Ouyang proposed a parallel algorithm for semantic concept mapping that adopts a two-stage concept searching method. It speeds up low-level feature extraction, latent semantic concept model searching, and the bridging of image low-level features and a global sharable ontology [4]. Kao presented techniques for parallel multimedia retrieval, taking an image database as an example. The main idea is that distributing the image data over a large number of nodes enables parallel processing of the compute-intensive operations for dynamic image retrieval. However, the approach still depends on the partitioning of the data and on the strategies applied for workload balancing [5]. Although cluster computing is widely used in image retrieval approaches, it attacks this problem only at the macro level. In particular, designing a distributed algorithm and programming it with cross-platform capability is difficult. In contrast, this paper is concerned with the micro-level aspect of the problem and uses multi-threading. Multi-threading is not the same as distributed processing. Distributed processing, which is sometimes called parallel processing, and multi-threading are both techniques used to achieve parallelism (and can be used in combination) [8]. Fortunately, with the increasing computational power of modern computers, some of the most time-consuming tasks in image indexing and retrieval are easily parallelized, so the multi-core architecture of modern CPUs and multi-threaded processing may be exploited to speed up image processing tasks. Moreover, it is possible to incorporate an image analysis algorithm into text-based image search engines such as Google, Yahoo, and Bing without degrading their response time significantly [9]. We also present a modified algorithm, namely auto color correlogram and correlation (ACCC) [10], based on the color correlogram (CC) [11], for extracting and indexing low-level features of images. The framework of multi-threaded processing for an on-line CBIR application is


proposed. It enhances the capability of an application when downloading images and comparing the similarity of images retrieved from diverse sources. Section 22.2 presents the framework of an on-line image retrieval system with multi-threading. Section 22.3 discusses the proposed indexing technique, which is used to speed up image processing tasks. The experimental study is presented in Sect. 22.4 and concluding remarks are set out in Sect. 22.5.

22.2 The Proposed Framework of Multithreading for an On-Line CBIR System Before introducing our framework of multi-threading for an on-line CBIR application, we briefly examine the properties of the queries to be answered. We have developed a novel framework of real-time processing for an on-line CBIR application, using relevant images from Yahoo Images. Our method uses the following major steps: (a) Yahoo Images is first used to obtain a large number of images that are returned for a given text-based query; (b) the user selects a relevant image, and the user’s feedback is automatically collected to update the input query for image similarity characterization; (c) a multi-threaded processing method is used to manage and perform data parallelism or loop-level parallelism for tasks such as downloading images, extracting visual features and computing visual similarity measures; (d) if necessary, users can also change the keyword before selecting a relevant image for the query; (e) the updated queries are further used to adaptively create a new answer for the next set of returned images according to the users’ personal preferences (see Fig. 22.1). In this section, a multi-threaded processing method is used to carry out parallel processing of multiple threads for a specific purpose. Multi-threading is a way to let a program do more than one thing at a time; it is implemented within a single program and runs on a single system. The number of threads should be chosen carefully, and the threads must be assigned to the correct parts of the program in order to be used efficiently. The functions, classes, and objects in the program should be designed logically as a sequence of steps. In this research, we first use threads to improve the speed of downloading images from various sources according to the locations specified in the .xml file returned by the Yahoo BOSS API [12]. Second, threads increase the speed of computing the image feature extraction and the similarity measure of feature vectors. The framework of multi-threaded processing is presented in Fig. 22.2. The thread control and the tasks inside a thread for downloading and retrieving images are presented in Figs. 22.3 and 22.4. An image list control receives the .xml files that are returned by the Yahoo BOSS API. The list of URLs is obtained from the .xml files; the URLs are then displayed and used for downloading images from the hosts. An image download module is designed to work as a multi-threaded process for downloading images from diverse sources. It is controlled by an image search control module. The image search control module performs a very important function in the


Fig. 22.1 Basic principles of the proposed system (figure labels: enter keyword, e.g. “Apple”; images from Yahoo database; image query; similar images 1–8)

management of the system. It supports and controls all modules of the on-line CBIR system, and it checks for errors and the input/output status of each module. Most importantly, it supports the synchronization of the multiple threads that perform image download and similarity measurement in the associated modules. The similarity measurement module computes the feature vectors and distance metrics of all images that are obtained from the image download module. The image download and similarity measurement modules work concurrently. The query results are recorded in an array in sequential order. The image list object is responsible for arranging all displayed images in the application.
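The cooperation of the download and similarity measurement modules can be sketched with a thread pool. The sketch below is an illustration only: it uses Python's `concurrent.futures` rather than the .NET threading facilities of the original application, it merges download and feature extraction into one task per image instead of running them as separate thread groups, and the callables `fetch`, `extract` and `distance` are hypothetical placeholders supplied by the caller.

```python
from concurrent.futures import ThreadPoolExecutor


def retrieve_similar(urls, query_feature, fetch, extract, distance, workers=8):
    """Download, index and rank images concurrently.

    urls: image locations parsed from the search result list.
    fetch(url) -> image data; extract(image) -> feature vector;
    distance(a, b) -> dissimilarity score (smaller is more similar).
    Returns (url, score) pairs sorted from most to least similar.
    """
    def task(url):
        image = fetch(url)                      # I/O-bound download
        feature = extract(image)                # feature extraction
        return url, distance(query_feature, feature)

    with ThreadPoolExecutor(max_workers=workers) as pool:
        scored = list(pool.map(task, urls))     # results keep the URL order
    return sorted(scored, key=lambda pair: pair[1])


# Toy stand-ins so the sketch runs without network access.
ranked = retrieve_similar(
    ["a", "bbb", "cc"],
    query_feature=2,
    fetch=lambda url: url,
    extract=len,
    distance=lambda q, f: abs(q - f),
)
```

A thread pool is a natural fit for the download stage because the threads spend most of their time waiting for remote hosts; the sorting step corresponds to the sorting module that orders the similarity result array.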

22.3 Feature Computation This paper’s main focus is on parallel computing techniques for image retrieval. The main objective is to reduce the processing time of a real-time CBIR system. However, an efficient color descriptor technique for image feature extraction is still required to reduce the processing time. In this section, we present an efficient algorithm for the proposed framework. It is a modification of the correlogram technique for color indexing. An auto color correlation (ACC) [10] expresses how to compute the mean color of all pixels of color $C_j$ at a distance $k$ from a pixel of color $C_i$ in the image. Formally, the ACC of image $\{I(x, y),\ x = 1, 2, \ldots, M,\ y = 1, 2, \ldots, N\}$ is defined by Eq. 22.1.


Fig. 22.2 The framework of real-time multi-threaded processing for an on-line CBIR application (figure blocks: keyword search; Yahoo! BOSS API; image result in XML format; XML parser and image list; image search control; image download with thread control and download threads; feature extraction and similarity measurement with thread control and feature extraction and comparison threads; similarity result array with sorting module; similarity image list and result list separated into pages)

$$ACC(i, j, k) = MC_j\,\gamma^{(k)}_{c_i c_j}(I) = \left\{ rm_{c_j}\gamma^{(k)}_{c_i c_j}(I),\ gm_{c_j}\gamma^{(k)}_{c_i c_j}(I),\ bm_{c_j}\gamma^{(k)}_{c_i c_j}(I) \ \middle|\ c_i \ne c_j \right\} \qquad (22.1)$$

where the original image $I(x, y)$ is quantized to $m$ colors $C_1, C_2, \ldots, C_m$ and the distance between two pixels $d \in [\min\{M, N\}]$ is fixed a priori. Let $MC_j$ be the mean color of the pixels of color $C_j$ at distance $k$ from pixels of color $C_i$ in image $I$. The arithmetic mean colors are computed by Eq. 22.2.


Fig. 22.3 The thread control and the tasks inside the thread for downloading images (thread states: Unstarted, Running, Wait/Sleep/Join, Abort Requested, Stopped; tasks while running: download image, image preparation, save image)

Fig. 22.4 The thread control and the tasks inside the thread for retrieving images


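$$
\begin{aligned}
rmc_j\,\gamma^{(k)}_{c_i c_j}(I) &= \frac{\Gamma^{(k)}_{c_i,\,rc_j}(I)}{\Gamma^{(k)}_{c_i,\,c_j}(I)} \;\Big|\; c_i \neq c_j \\
gmc_j\,\gamma^{(k)}_{c_i c_j}(I) &= \frac{\Gamma^{(k)}_{c_i,\,gc_j}(I)}{\Gamma^{(k)}_{c_i,\,c_j}(I)} \;\Big|\; c_i \neq c_j \\
bmc_j\,\gamma^{(k)}_{c_i c_j}(I) &= \frac{\Gamma^{(k)}_{c_i,\,bc_j}(I)}{\Gamma^{(k)}_{c_i,\,c_j}(I)} \;\Big|\; c_i \neq c_j
\end{aligned}
\tag{22.2}
$$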

The numerator $\Gamma^{(k)}_{c_i,\,xc_j}(I)$ is the sum of the pixel color values of color $C_j$ at distance $k$ from any pixel of color $C_i$, where $xC_j$ denotes the R, G, or B component of color $C_j$, with $C_j \neq 0$. The denominator $N$ is the number of pixels of color $C_j$ counted at distance $k$ from pixels of color $C_i$, defined by Eq. 22.3.

$$N = \Gamma^{(k)}_{c_i,\,c_j}(I) = \left\{ P(x_1, y_1) \in C_i \;\middle|\; P(x_2, y_2) \in C_j;\; k = \min\{|x_1 - x_2|,\, |y_1 - y_2|\} \right\} \tag{22.3}$$

We propose an extension of ACC based on the autocorrelogram, namely the Auto Color Correlogram and Correlation (ACCC). It integrates the autocorrelogram [5] and auto color correlation [10] techniques. However, the size of the ACCC is still O(md). The ACCC is defined by Eq. 22.4.

$$\mathrm{ACCC}(i,j,k) = \left\{ \gamma^{(k)}_{c_i}(I),\; MC_j\,\gamma^{(k)}_{c_i c_j}(I) \right\} \tag{22.4}$$

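Let the ACCC pairs for the $m$ color bins be $(\alpha_i, \beta_i)$ in $I$ and $(\alpha'_i, \beta'_i)$ in $I'$. The similarity of the images is measured as the distance $d(I, I')$ between the autocorrelograms and the auto color correlations, which is derived from Lee et al. [13] and given by Eq. 22.5.

$$d(I, I') = \lambda_1 \sum_{\forall i} \frac{|\alpha_i - \alpha'_i|}{0.1 + \alpha_i + \alpha'_i} + \lambda_2 \sum_{\forall i} \frac{|\beta_i - \beta'_i|}{0.1 + \beta_i + \beta'_i} \tag{22.5}$$

Here $\lambda_1$ and $\lambda_2$ are the similarity weighting constants of the autocorrelogram and the auto color correlation, respectively. In the experiments conducted, $\lambda_1 = 0.5$, and $\alpha_i$ and $\beta_i$ are defined by Eq. 22.6. The details of the ACC and ACCC algorithms are presented in Tungkastsathan and Premchaisawadi [10].

$$
\begin{aligned}
\alpha_i &= \gamma^{(k)}_{c_i}(I) \\
\beta_i &= \left\{ rmc_j\,\gamma^{(k)}_{c_i c_j}(I),\; gmc_j\,\gamma^{(k)}_{c_i c_j}(I),\; bmc_j\,\gamma^{(k)}_{c_i c_j}(I) \;\middle|\; c_i \neq c_j \right\}
\end{aligned}
\tag{22.6}
$$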
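As a minimal sketch of the distance in Eq. 22.5, the two weighted sums can be computed from per-bin feature lists. The function name `accc_distance`, the flattening of each RGB mean-color triple into plain scalars, and the default second weight of 0.5 are assumptions made for illustration; the chapter only states the value of the first weight.

```python
def accc_distance(alpha, alpha_p, beta, beta_p, lam1=0.5, lam2=0.5):
    """Weighted relative L1 distance in the form of Eq. 22.5.

    alpha / alpha_p: autocorrelogram values of the two images, one per bin.
    beta / beta_p: mean-color values, flattened to scalars (an assumption).
    lam2 defaults to 0.5 only for illustration.
    """
    if len(alpha) != len(alpha_p) or len(beta) != len(beta_p):
        raise ValueError("feature vectors must have matching lengths")

    def term(xs, ys):
        # The constant 0.1 keeps the denominator non-zero for empty bins.
        return sum(abs(x - y) / (0.1 + x + y) for x, y in zip(xs, ys))

    return lam1 * term(alpha, alpha_p) + lam2 * term(beta, beta_p)
```

Identical feature vectors give a distance of zero, and a smaller value means a closer match, so candidate images can be ranked by sorting on this value in ascending order.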


22.4 Experiment and Evaluation

The experiments were divided into two groups. In group 1, we evaluated the retrieval rate for on-line Yahoo image data sets in terms of user relevance. In group 2, we studied the performance of multi-threaded processing in terms of data parallelism for real-time image retrieval tasks.

22.4.1 Evaluating the Retrieval Rate

We have implemented an on-line image retrieval system using the Yahoo image database through the Yahoo BOSS API. The application was developed with Microsoft .NET and runs in the Windows NT environment. The goal of this experiment is to show that relevant images can be found after a small number of iterations; only the first round is used in this experiment. From the viewpoint of user interface design, precision and recall measures are less appropriate for assessing an interactive system [14]. To evaluate the performance of the system in terms of user feedback, user-oriented measures are used. Other measures that have been proposed include relative recall, recall effort, coverage ratio, and novelty ratio [15]. In this experiment the coverage ratio is used. Let $R$ be the set of relevant images of query $q$ and $A$ be the answer set retrieved. Let $|U|$ be the number of relevant images which are known to the user, where $U \subseteq R$. Let $R_k$ be the intersection of the sets $A$ and $U$, and $|R_k|$ the number of images in this set. The coverage ratio is defined by Eq. 22.7.

$$\mathrm{Coverage}(C_q) = \frac{|R_k|}{|U|} \tag{22.7}$$

Let $N(q)$ be the number of keywords used. The average coverage ratio is given by Eq. 22.8.

$$C(q) = \frac{1}{N(q)} \sum_{i=1}^{N(q)} \frac{|R_k|}{|U|} \tag{22.8}$$
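The two measures can be illustrated with a short sketch. The function names and the use of plain Python sets of image identifiers are assumptions for illustration; the chapter does not describe how the sets were represented.

```python
def coverage_ratio(answer_set, known_relevant):
    """Fraction of the relevant images known to the user that were retrieved."""
    known = set(known_relevant)
    if not known:
        raise ValueError("the set of known relevant images must not be empty")
    return len(set(answer_set) & known) / len(known)


def average_coverage(per_query):
    """Mean coverage over (answer_set, known_relevant) pairs, one per keyword."""
    ratios = [coverage_ratio(answers, known) for answers, known in per_query]
    return sum(ratios) / len(ratios)
```

For example, if the user knows four relevant images and two of them appear in the answer set, `coverage_ratio` returns 0.5.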

To conduct this experiment, Yahoo Images is first queried to obtain a large number of images returned by a given text-based query. The user then selects one relevant image; only a single interaction with the user is required. The images that are most similar to the new query image are returned. The retrieval performance, in terms of coverage ratio, of the proposed system is compared with the traditional Yahoo text-based search results. The average coverage ratio is generated based on the ACC and ACCC algorithms across 49 random test keywords in heterogeneous categories (i.e. animal, fruit, sunset, nature, and landscape). The results are presented in Table 22.1.


Table 22.1 Coverage ratio average of the top 24 of 200 retrieved images

Coverage ratio   Animal   Fruit   Sunset/sunrise   Nature   Landscape
Sample 1         0.71     0.79    0.62             0.64     0.69
Sample 2         0.65     0.71    0.65             0.59     0.65
Avg.             0.68     0.75    0.63             0.62     0.67
Text-based       0.42     0.32    0.58             0.36     0.43

The data in Table 22.1 show that user feedback combined with a keyword and the ACCC algorithm can increase the effectiveness of image retrieval from the Yahoo image database. By combining text with the user's feedback in an image search, the images that do not correspond to the category are filtered out, and images from other categories are less likely to be retrieved. In the experiment, we used two sample images obtained from the keyword search as query images for evaluating the performance of the system. Screenshots of the on-line image search application are shown in Figs. 22.5 and 22.6.

Fig. 22.5 Query results using a keyword search


Fig. 22.6 Query results after applying relevance feedback

22.4.2 Performance of Multithreading in the Image Retrieval Tasks

In the experimental settings, we used one keyword to download two hundred images and performed the image search in the same environment (internet speed, time of testing, hardware and software platforms). We tested the application with 49 keywords in heterogeneous categories (i.e. animal, fruit, sunset, nature, and landscape). We ran the image search three times for each keyword and calculated the average processing time of the whole process for an on-line image retrieval task. The number of downloaded images for each keyword had a maximum error of less than ten percent of the total downloaded images. The threads were tested and run on two different hardware platforms, with single-core and multi-core CPUs. The hardware specifications are as follows: (1) Pentium IV single-core 1.8 GHz with 1 GB DDR2 RAM; (2) Quad-Core Intel Xeon processor E5310 1.60 GHz, 1066 MHz FSB, with 1 GB (2 × 512 MB) PC2-5300 DDR2 RAM. The number of threads versus time on single-core and multi-core CPUs for an image retrieval process that includes image downloading, feature extraction and image comparison is shown in our previous work [16]. We can conclude that the processing time for the same number of threads differs between the platforms for an image retrieval task (see Figs. 22.7 and 22.8). We therefore selected the most suitable number of threads from the tests on each platform as the basis for the hypothesis test. The results are shown in Table 22.2.
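The data-parallel scheme described in this section can be sketched with a thread pool. This is only an illustration in Python, not the authors' .NET implementation: `fetch_and_score` is a hypothetical stand-in that neither downloads an image nor extracts features, and the number of workers is an arbitrary parameter.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_and_score(url):
    """Hypothetical stand-in for download + feature extraction + comparison."""
    # A real task would fetch the image and compute its distance to the query.
    return url, len(url) % 7


def rank_images(urls, workers=10):
    """Process every URL on a pool of worker threads and sort by score."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(fetch_and_score, urls))
    return sorted(results, key=lambda pair: pair[1])
```

Because each image is handled independently, the same function can be timed with different values of `workers` to compare thread counts on a given machine.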

Fig. 22.7 Number of threads versus time on multi-core in all processes for online CBIR system [16]. Axes: number of queries (1 to 49) against time in seconds; series: serialized, 5, 10 and 25 threads

Fig. 22.8 Number of threads versus time on multi-core in all processes for online CBIR system [16]. Axes: number of queries (1 to 49) against time in seconds; series: serialized, 5, 10, 25 and 50 threads


Table 22.2 Average time in seconds of the whole process (image downloading, feature extraction, and image comparison) at the most suitable number of threads on each platform (mean ± stddev)

W(q)   S-core 10 threads   Q-core 50 threads   W(q)   S-core 10 threads   Q-core 50 threads
1      159.6 ± 3.3         57.0 ± 5.7          26     153.6 ± 9.2         61.0 ± 5.1
2      165.6 ± 13.5        57.3 ± 6.9          27     156.6 ± 4.9         64.3 ± 5.8
3      165.7 ± 15.1        65.7 ± 3.9          28     157.6 ± 13.1        68.3 ± 3.9
4      148.7 ± 18.9        70.0 ± 6.2          29     159.0 ± 11.4        66.0 ± 7.8
5      155.6 ± 4.9         75.3 ± 7.1          30     161.3 ± 11.8        56.7 ± 1.7
6      150.0 ± 13.1        53.3 ± 1.7          31     160.3 ± 7.6         51.3 ± 4.5
7      159.3 ± 8.5         60.7 ± 1.2          32     165.6 ± 16.9        63.3 ± 4.2
8      158.0 ± 21.4        65.7 ± 2.5          33     153.0 ± 7.5         69.3 ± 3.9
9      148.3 ± 17.6        61.3 ± 5.4          34     153.3 ± 12.6        59.0 ± 4.5
10     160.0 ± 21.9        67.0 ± 3.6          35     150.0 ± 10.7        63.6 ± 6.0
11     158.3 ± 5.7         71.0 ± 2.2          36     158.0 ± 11.3        63.7 ± 2.1
12     162.7 ± 10.9        66.3 ± 4.0          37     159.6 ± 11.4        60.7 ± 6.8
13     155.3 ± 3.4         63.7 ± 1.2          38     154.0 ± 5.9         61.0 ± 2.9
14     157.7 ± 3.1         62.0 ± 1.2          39     157.0 ± 13.4        65.7 ± 2.5
15     149.3 ± 4.5         58.7 ± 7.5          40     156.6 ± 8.9         62.3 ± 4.2
16     152.0 ± 2.3         61.3 ± 2.5          41     165.0 ± 11.1        53.7 ± 4.8
17     167.7 ± 18.6        66.0 ± 7.8          42     164.3 ± 9.7         56.7 ± 2.6
18     174.7 ± 15.1        66.0 ± 4.9          43     149.3 ± 11.1        56.3 ± 3.1
19     171.3 ± 7.6         58.0 ± 4.1          44     138.3 ± 6.2         64.3 ± 7.8
20     162.0 ± 10.0        71.0 ± 9.2          45     148.6 ± 4.0         58.0 ± 0.8
21     163.7 ± 3.7         59.3 ± 4.5          46     150.0 ± 9.9         57.7 ± 6.8
22     162.3 ± 8.3         58.0 ± 7.1          47     146.0 ± 7.5         57.0 ± 4.3
23     162.0 ± 4.2         56.0 ± 5.1          48     145.0 ± 8.5         60.7 ± 5.7
24     156.3 ± 16.0        72.3 ± 10.2         49     150.0 ± 10.4        54.0 ± 3.3
25     154.7 ± 8.9         64.3 ± 11.6         Avg    157.1 ± 8.4         62.0 ± 7.3

We tested the hypothesis with a statistical t-test. The t-test was run on the 49 keywords used for retrieving images in order to measure the significance of the difference in complete processing time obtained with our proposed scheme (see Table 22.2). The mean processing times on the single-core and multi-core platforms are 157.12 ± 8.4 and 62.0 ± 7.3 s, respectively. Comparing the means of the two independent CPU platforms with the t-test gives a P value of 1.98e-25 for single-core versus multi-core. The test shows that the multi-core platform consumes significantly less processing time than the single-core platform.
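A two-sample comparison of this kind can be sketched as follows. The chapter does not say which variant of the t-test was used, so Welch's unequal-variance statistic is an assumption here, and the sketch stops at the t statistic without computing a P value. The sample data are the first five per-keyword means from Table 22.2, used only as an illustration.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Welch's t statistic for two independent samples (assumed test variant)."""
    var_a, var_b = variance(sample_a), variance(sample_b)
    std_err = sqrt(var_a / len(sample_a) + var_b / len(sample_b))
    return (mean(sample_a) - mean(sample_b)) / std_err


# First five per-keyword mean times from Table 22.2, in seconds.
single_core = [159.6, 165.6, 165.7, 148.7, 155.6]
quad_core = [57.0, 57.3, 65.7, 70.0, 75.3]
t_stat = welch_t(single_core, quad_core)
```

A large positive `t_stat` indicates that the single-core times are higher than the quad-core times by much more than the spread within each group.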

22.5 Conclusions

This research presents an interactive approach to filter out junk images from keyword-based Yahoo image search results. The advanced spatial color descriptors auto color correlation (ACC) and auto color correlogram and correlation (ACCC) are proposed. To reduce the processing time of feature computation, a multi-threaded processing method is also proposed. The coverage ratio measure is used to evaluate the retrieval performance of the user's relevance feedback. Experiments on diverse keyword-based queries from the Yahoo Images search engine obtained very positive results. Additionally, the experimental results show that our proposed scheme can speed up feature extraction and image similarity measurement as well as image downloading from various hosts. The use of multiple threads can significantly improve the performance of image indexing and retrieval on both platforms. In future work based on this study, distributed processing and multithreading will be considered in combination to achieve parallelism.

References

1. Yuli G, Jinye P, Hangzai L, Keim DA, Jianping F (2009) An interactive approach for filtering out junk images from keyword based Google search results. IEEE Trans Circuits Syst Video Technol 19(12):1–15
2. Lu Y, Gao P, Lv R, Su Z, Yu W (2007) Study of content-based image retrieval using parallel computing technique. In: Proceedings of the 2007 Asian technology information program's (ATIP's), 11 November–16 November 2007, China, pp 186–191
3. Kao O, Steinert G, Drews F (2001) Scheduling aspects for image retrieval in cluster-based image databases. In: Proceedings of first IEEE/ACM. Cluster computing and the grid, 15 May–18 May 2001, Brisbane, Australia, pp 329–336
4. Ling Y, Ouyang Y (2008) Image semantic information retrieval based on parallel computing. In: Proceeding of international colloquium on computing, communication, control, and management, CCCM, 3 August–4 August 2008, vol 1, pp 255–259
5. Kao O (2001) Parallel and distributed methods for image retrieval with dynamic feature extraction on cluster architectures. In: Proceedings of 12th international workshop on database and expert systems applications, Munich, Germany, 3 September 2001–7 September 2001, pp 110–114
6. Pengdong G, Yongquan L, Chu Q, Nan L, Wenhua Y, Rui L (2008) Performance comparison between color and spatial segmentation for image retrieval and its parallel system implementation. In: Proceedings of the international symposium on computer science and computational technology, ISCSCT 2008, 20 December–22 December 2008, Shanghai, China, pp 539–543
7. Town C, Harrison K (2010) Large-scale grid computing for content-based image retrieval. Aslib Proc 62(4/5):438–446
8. Multi-threading in IDL. http://www.ittvis.com/
9. Gao Y, Fan J, Luo H, Satoh S (2008) A novel approach for filtering junk images from Google search results. In: Lecture notes in computer science: advances in multimedia modeling, vol 4903, pp 1–12
10. Tungkastsathan A, Premchaisawadi W (2009) Spatial color indexing using ACC algorithms. In: Proceeding of the international conference on ICT and knowledge engineering, 1 December–2 December 2009, Bangkok, Thailand, pp 113–117
11. Huang J, Kumar SR, Mitra M, Zhu W-J (1998) Spatial color indexing and applications. In: Proceeding of sixth international conference on computer vision, 4 January–7 January 1998, Bombay, India, pp 606–607
12. Yahoo BOSS API. http://developer.yahoo.com/search/boss/
13. Lee HY, Lee HK, Ha HY (2003) Spatial color descriptor for image retrieval and video segmentation. IEEE Trans Multimed 5(3):358–367
14. Ricardo B-Y, Berthier R-N (1999) Modern information retrieval. ACM Press Book, New York
15. Robert RK (1993) Information storage and retrieval. Wiley, New York
16. Premchaisawadi W, Tungkatsathan A (2010) Micro level attacks in real-time image processing for an on-line CBIR system. In: Lecture notes in engineering and computer science: proceedings of the world congress on engineering 2010, WCE 2010, 30 June–2 July 2010, London, UK, pp 182–186

Chapter 23

Topological Mapping Using Vision and a Sparse Distributed Memory

Mateus Mendes, A. Paulo Coimbra and Manuel M. Crisóstomo

Abstract Navigation based on visual memories is very common among humans. However, planning long trips requires a more sophisticated representation of the environment, such as a topological map, where connections between paths are easily noted. The present approach is a system that learns paths by storing sequences of images and image information in a sparse distributed memory (SDM). Connections between paths are detected by exploring similarities in the images, using the same SDM, and a topological representation of the paths is created. The robot is then able to plan paths and switch from one path to another at the connection points. The system was tested under reconstitutions of country and urban environments, and it was able to successfully map, plan paths and navigate autonomously.

M. Mendes (corresponding author)
ESTGOH, Polytechnic Institute of Coimbra, R. General Santos Costa, 3400-124 Oliveira do Hospital, Portugal

M. Mendes, A. P. Coimbra, M. M. Crisóstomo
Institute of Systems and Robotics, Pólo II, University of Coimbra, 3000 Coimbra, Portugal

S. I. Ao and L. Gelman (eds.), Electrical Engineering and Applied Computing, Lecture Notes in Electrical Engineering, 90, DOI: 10.1007/978-94-007-1192-1_23, © Springer Science+Business Media B.V. 2011

23.1 Introduction

About 80% of all the information humans rely on is visual [4], and the brain operates mostly with sequences of images [5]. View sequence based navigation is also extremely attractive for autonomous robots, for the hardware is very straightforward, and the approach is biologically plausible. However, while humans are able to navigate quite well based only on visual information, images usually require huge computer processing power. This means that for real time robot operation, visual information is often avoided. Other sensors, such as sonar or laser range finders, provide accurate information at a much lower computational cost. The goal of equipping robots with cameras and vision-based navigation is still an open research issue.

The use of special landmarks (possibly artificial, such as barcodes or data matrices) is a trick that can greatly improve the accuracy of the system [13]. As for the images, there are two popular approaches: one that uses plain images [6], the other that uses panoramic images [8]. Panoramic images offer a 360° view, which is richer than a plain front or rear view. However, that richness comes at the cost of additional processing power requirements. Besides, the process of acquiring panoramic images requires the use of parabolic mirrors, which also introduce some distortion in the images. Some authors have also proposed techniques to speed up processing and/or reduce memory needs. Matsumoto [7] uses images as small as 32 × 32 pixels. Ishiguro [2] replaced the images by their Fourier transforms. Winters [15] compresses the images using Principal Component Analysis. All those techniques improve the processing time and/or efficiency of image processing in real time, contributing to make robot navigation based on image processing more plausible.

The images alone are a means for instantaneous localisation. View-based navigation is almost always based on the same idea: during a learning stage the robot learns a sequence of views and motor commands that, if followed with minimum drift, will lead it to a target location.
By following the sequence of commands, possibly correcting the small drifts that may occur, the robot is later able to follow the learnt path on its own. The idea is very simple and it works very well for single paths. However, it is not versatile and requires that all the paths are taught one by one. For complex trips and environments, that may be very time consuming. The process can be greatly simplified using topological maps and path planning algorithms. To plan paths efficiently, switching from one path to another at connection nodes, when necessary, more sophisticated representations of the environment are required than just plain images of sampling points. Those representations are provided by metric or topological maps [12]. Those maps represent paths and connections between them. They are suitable to use with search algorithms such as A*, for implementing intelligent planning and robot navigation. This paper explains how vision-based navigation is achieved using a sparse distributed memory (SDM) to store sequences of images. The memory is also used to recognise overlaps of the paths and thus establish connection nodes where the robot can switch from one path to another. That way, a topological representation of the world can be constructed, and the system can plan paths. Part of this work has already been published in [9]. Section 23.2 explains navigation based on view sequences in more detail. Section 23.3 explains how the SDM works. In Sect. 23.4 the robot platform used for the experiments is described. Section 23.5 describes the navigation algorithm, and Sect. 23.6 shows and discusses the results obtained.


23.2 Navigation Using View Sequences

Usually, the vision-based approaches for robot navigation are based on the concept of a "view-sequence" and a look-up table of motor commands, where each view is associated with a corresponding motor command that leads the robot towards the next view in the sequence. In the present work, the approach followed is very similar to that of Matsumoto et al. [7]. That approach requires a learning stage, during which the robot must be manually guided. While being guided, the robot memorises a sequence of views automatically. While running autonomously, the robot performs automatic image-based localisation and obstacle detection, taking action in real time. Localisation is estimated based on the similarity of two views: one stored during the learning stage and another grabbed in real time. The robot tries to find matching areas between those two images, and calculates the horizontal distance between them in order to infer how far it is from the correct path. That distance is then used to correct possible drifts to the left or to the right. The technique is described in more detail in [10].

23.3 Sparse Distributed Memories

The sparse distributed memory is an associative memory model proposed by Kanerva in the 1980s [5]. It is suited to working with high-dimensional binary vectors. In the present work, an image can be regarded as a high-dimensional vector, and the SDM can be used simultaneously as a sophisticated storage and retrieval mechanism and a pattern-matching tool.

23.3.1 The Original Model

The underlying idea behind the SDM is the mapping of a huge binary memory onto a smaller set of physical locations, called hard locations. As a general guideline, those hard locations should be uniformly distributed in the virtual space, to mimic the existence of the larger virtual space as accurately as possible. Every datum is stored by distribution to a set of hard locations, and retrieved by averaging those locations and comparing the result to a given threshold.

Figure 23.1 shows a model of an SDM. "Address" is the reference address where the datum is to be stored or read from. It activates all the hard locations within a given access radius, which is predefined. Kanerva proposes that the Hamming distance, that is, the number of bits in which two binary vectors differ, be used as the measure of distance between the addresses. All the locations that differ from the input address by no more than a predefined number of bits are selected for the read or write operation. In the figure, the first and the third locations are selected. They are, respectively, 2 and 3 bits away from the input address, and the activation radius is exactly 3 bits.

Fig. 23.1 One model of a SDM, using bit counters

Data are stored in arrays of counters, one counter for every bit of every location. Writing is done by incrementing or decrementing the bit counters at the selected addresses. To store 0 at a given position, the corresponding counter is decremented. To store 1, it is incremented. Reading is done by averaging the values of all the counters columnwise and thresholding at a predefined value. If the resulting value is below the threshold, the bit is zero, otherwise it is one. Initially, all the bit counters must be set to zero, for the memory stores no data. The bits of the address locations should be set randomly, so that the addresses are uniformly distributed in the addressing space. There is no guarantee that the data retrieved are exactly the same as the data that were written. They should be, provided that the hard locations are correctly distributed over the binary space and the memory has not reached saturation.
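The read and write operations of the original model can be sketched in a few lines. This is a toy illustration under stated assumptions, not the implementation used in this work: the class name, the seeded random generator, the column sum in place of an average, and the default threshold of zero are all choices made here.

```python
import random


class BitCounterSDM:
    """Toy sparse distributed memory with one counter per bit per hard location."""

    def __init__(self, n_locations, n_bits, radius, seed=0):
        rng = random.Random(seed)
        self.radius = radius
        # Hard-location addresses are drawn at random over the binary space.
        self.addresses = [
            [rng.randint(0, 1) for _ in range(n_bits)] for _ in range(n_locations)
        ]
        # All counters start at zero: the memory holds no data yet.
        self.counters = [[0] * n_bits for _ in range(n_locations)]

    def _selected(self, address):
        """Indices of hard locations within the access radius (Hamming distance)."""
        return [
            i
            for i, hard in enumerate(self.addresses)
            if sum(a != b for a, b in zip(hard, address)) <= self.radius
        ]

    def write(self, address, data):
        """Increment counters for 1 bits and decrement them for 0 bits."""
        for i in self._selected(address):
            for k, bit in enumerate(data):
                self.counters[i][k] += 1 if bit else -1

    def read(self, address, threshold=0):
        """Sum the selected counters columnwise and threshold each column."""
        selected = self._selected(address)
        n_bits = len(self.counters[0])
        sums = [sum(self.counters[i][k] for i in selected) for k in range(n_bits)]
        return [0 if s < threshold else 1 for s in sums]
```

With a radius smaller than the word length, only the hard locations near the address take part in each operation, which is what distributes a datum over several locations.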

23.3.2 The Model Used

The original SDM model, though theoretically sound and attractive, has some faults. One problem is that of selecting the hard locations at random at the beginning of the operation. Another problem is that of using bit counters, which cause a very low storage rate of about 0.1 bits per bit of traditional computer memory and slow down the system. Those problems have been thoroughly described in [11], where the authors study alternative architectures and methods of encoding the data.

To overcome the problem of placing hard locations in the address space, in the present work the hard locations are selected using the Randomised Reallocation algorithm proposed by Ratitch and Precup [14]. The idea is that the system starts with an empty memory and allocates new hard locations when there is a new datum which cannot be stored in enough existing locations. The new locations are placed randomly in the neighbourhood of the new datum address.

Fig. 23.2 Alternative architecture of the SDM, auto-associative and using integer numbers

Fig. 23.3 Robot used

To overcome the problem of using bit counters, the bits are grouped as integers, as shown in Fig. 23.2. Addressing is done using an arithmetic distance, instead of the Hamming distance. Learning is achieved through a gradient descent approach, updating each byte value using the equation:

$$h_t^k = h_{t-1}^k + \alpha \left( x^k - h_{t-1}^k \right), \qquad \alpha \in \mathbb{R} \wedge 0 \le \alpha \le 1 \tag{23.1}$$

The value $h_t^k$ is the $k$th integer number in the hard location $h$, at time $t$. The value $x^k$ is the corresponding $k$th integer number in the input vector $x$. The coefficient $\alpha$ is the learning rate; in this case it was set to 1, enforcing one-shot learning.
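The update of Eq. 23.1 can be written as a one-line rule. The function name is hypothetical, and the sketch leaves the results as floating-point numbers, whereas a byte-based memory would round them back to integers.

```python
def update_location(hard, x, alpha=1.0):
    """Move each integer of a hard location towards the input vector (Eq. 23.1)."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return [h + alpha * (xk - h) for h, xk in zip(hard, x)]
```

With the learning rate set to 1 the hard location simply takes the value of the input, which is the one-shot behaviour described above; smaller values blend the old and new contents.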

23.4 Experimental Platform

The robot used was a Surveyor SRV-1, a small robot with tank-style treads and differential drive via two precision DC gearmotors (Fig. 23.3). Among other features, it has a built-in digital video camera and an 802.15.4 radio communication module. This robot was controlled in real time from a laptop with a 1.8 GHz processor and 1 GB RAM. The overall software architecture is shown in Fig. 23.4. It contains three basic modules:

1. The SDM, where the information is stored.
2. The Focus (following Kanerva's terminology), where the navigation algorithms are run.
3. An operational layer, responsible for interfacing the hardware and some tasks such as motor control, collision avoidance and image equalisation.

Fig. 23.4 Architecture of the implemented software

Navigation is based on vision, and has two modes: supervised learning, in which the robot is manually guided and captures images to store for future reference; and autonomous running, in which it uses previous knowledge to navigate autonomously, following any sequence previously learnt. The vectors stored in the SDM consist of arrays of bytes, as summarised in Eq. 23.2:

$$x_i = \langle im_i,\; seq\_id,\; i,\; timestamp,\; motion \rangle \tag{23.2}$$

In the vector, $im_i$ is the image $i$, in PGM (Portable Gray Map) format and 80 × 64 resolution. In PGM images, every pixel is represented by an 8-bit integer. The value 0 corresponds to a black pixel, and the value 255 represents a white pixel. $seq\_id$ is an auto-incremented, 4-byte integer, unique for each sequence. It is used to identify which sequence the vector belongs to. The number $i$ is an auto-incremented, 4-byte integer, unique for every vector in the sequence, used to quickly identify every image in the sequence. The $timestamp$ is a 4-byte integer, storing a Unix timestamp. It is not being used so far for navigation purposes. The field $motion$ is a single character, identifying the type of movement the robot performed after the image was grabbed. The image alone uses 5,120 bytes. The overhead information comprises 13 additional bytes. Hence, the input vector contains a total of 5,133 bytes.
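The byte count can be checked with a small serialisation sketch. The field order follows Eq. 23.2, but the little-endian byte order, the function name and the example motion code are assumptions; the chapter does not specify the in-memory encoding.

```python
import struct

IMAGE_BYTES = 80 * 64  # one 8-bit grey level per pixel
# seq_id, i and timestamp as 4-byte unsigned integers, motion as one character.
# "<" selects little-endian with no padding (an assumed encoding).
HEADER = struct.Struct("<IIIc")


def pack_vector(image, seq_id, index, timestamp, motion):
    """Serialise one input vector: image bytes followed by the overhead fields."""
    if len(image) != IMAGE_BYTES:
        raise ValueError("image must hold exactly 80 x 64 pixels")
    return bytes(image) + HEADER.pack(seq_id, index, timestamp, motion)
```

Packing a blank image with any header values yields 5,120 + 13 = 5,133 bytes, matching the total given above.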

23.5 Mapping and Planning

The "teach and follow" approach per se is very simple and powerful. But for robust navigation and route planning, it is necessary to extend the basic algorithm to perform additional tasks. For example, it is necessary to detect connection points between the paths learnt, when two or more paths cross, come together or split apart. It is also necessary to disambiguate when there are similar images or divergent paths.


Fig. 23.5 Example of paths that have a common segment. The robot only needs to learn AB once

23.5.1 Filtering Out Unnecessary Images

During learning in vision-based navigation, not every single picture needs to be stored. There are scenarios, such as corridors, in which the views are very similar for a long period of time. Those images do not provide data useful for navigation. Therefore, they can be filtered out during the learning stage, so that only images which are sufficiently different from their predecessors are stored. That behaviour can be easily implemented using the SDM: every new image is only stored if there is no image within a predefined radius in the SDM. If the error in similarity between the new image and any image in the SDM is below a given threshold, the new image is discarded. A good threshold to use for that purpose is the memory activation radius. Because of the way the SDM works, new images that are less than an activation radius from an already stored image will be stored in the same hard locations. Therefore, they are most probably unnecessary, and can be discarded with no risk of impairing the performance of the system.
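The filtering rule can be sketched without a full SDM by keeping a plain list of stored views. The list, the function name and the sum of absolute pixel differences used as the distance are assumptions for illustration; in the actual system the comparison is made against the contents of the memory.

```python
def filter_views(images, radius):
    """Keep an image only if no already-kept image lies within `radius` of it."""

    def distance(a, b):
        # Sum of absolute pixel differences: an assumed arithmetic distance.
        return sum(abs(p - q) for p, q in zip(a, b))

    kept = []
    for image in images:
        if all(distance(image, stored) > radius for stored in kept):
            kept.append(image)
    return kept
```

Given a run of nearly identical corridor views, only the first one survives, and a new view is stored again once the scene has changed by more than the radius.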

23.5.2 Detecting Connection Points

Another situation in which new images do not provide useful information is the case when two paths have a common segment, as depicted in Fig. 23.5. The figure shows two different paths, 1 and 2, in which the segment AB is common. If the robot learns segment AB for path 1, for example, then it does not need to learn it again for path 2. When learning path 2, it only needs to learn it until point A. Then it can store an association between paths 1 and 2 at point A and skip all the images until point B. At point B, it should again record a connection between paths 1 and 2. That way, it builds a map of the connection points between the known paths. That is a kind of topological representation of the environment.

The main problem with this approach is to detect the connection points. The points where the paths come together (point A in Fig. 23.5) can be detected after a reasonable number of images of path 1 have been retrieved while the robot is learning path 2. When that happens, the robot stores the connection in its working memory and stops learning path 2. From that point onwards, it keeps monitoring whether it is following the same path that it has learnt. After a reasonable number of predictions have failed, it adds another connection point to the graph and resumes learning the new path. In the tests with the SDM, 3–5 consecutive images within the access radius usually sufficed to establish a connection point, and 3–5 images out of the access radius were a good indicator that the paths were diverging again.
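The run-counting logic can be sketched as a small state machine over per-image match flags. The function name, the event labels and the choice to report the index where each run started are assumptions; the default of three consecutive images sits at the lower end of the range reported above.

```python
def find_connections(matches, min_run=3):
    """Return (kind, index) events from per-image match flags.

    matches[i] is True when image i of the path being learnt fell within
    the access radius of an image of an already known path.
    """
    events = []
    on_known_path = False
    run = 0  # length of the current run that contradicts the present state
    for i, matched in enumerate(matches):
        if matched != on_known_path:
            run += 1
            if run == min_run:
                on_known_path = matched
                kind = "join" if matched else "split"
                events.append((kind, i - min_run + 1))
                run = 0
        else:
            run = 0
    return events
```

Isolated matches or misses shorter than `min_run` are ignored, so a single spurious prediction does not create a connection point.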

23.5.3 Sequence Disambiguation

One problem that arises when using navigation based on sequences is that of sequence disambiguation. Under normal circumstances, sequences such as (1) ABC, (2) XBZ or (3) DEFEG may occur, each capital letter representing a random input vector. There are two different problems with these three sequences: (1) and (2) share one common element (B), and one element (E) occurs in two different positions of sequence (3). In the first case, the successor of B can be either C or Z. In the second case, the successor of E can be either F or G. The correct prediction depends on the history of the system. One possible solution relies on using a kind of short-term memory. Kanerva proposes a solution in which the input to the SDM is not the last input $D_t$, but the juxtaposition of the last $k$ inputs $\{D_t, D_{t-1}, \ldots, D_{t-k}\}$. This technique is called folding, and $k$ is the number of folds. The disadvantage is that it greatly increases the dimensionality of the input vector. Bose [1] uses an additional neural network to store a measure of the context, instead of adding folds to the memory. In the present work, a solution inspired by Jaeckel and Karlsson's proposal of segmenting the addressing space [3] seemed more appropriate. Jaeckel and Karlsson propose fixing a certain number of coordinates when addressing, thus reducing the number of hard locations that can be selected. In the present work, the goal is to retrieve an image only from within the sequence that is being followed, so their idea is appropriate for that purpose. The number of the sequence can be fixed, thus truncating the addressing space.

23.6 Experiments and Results

Due to practical constraints, the experiments were performed in a small testbed in the laboratory. The testbed consisted of an arena surrounded by a realistic countryside scenario, or filled with objects simulating an urban environment.


23.6.1 Tests in an Arena Simulating a Country Environment

The first experiment consisted of analysing the behaviour of the navigation algorithm in the arena. The surrounding wall was printed with a composition of images of mountain views, as shown in Fig. 23.8. The field of view of the camera is relatively narrow (about 40°), so the robot cannot capture above or beyond the wall. Sometimes it can capture parts of the floor. Figure 23.6 shows an example of the results obtained. In the example, the robot was first taught paths L1 and L2. Then the memory was loaded with both sequences, establishing connection points A and B. The minimum number of overlapping images required for establishing a connection point was set to 3 consecutive images. The minimum number of different images necessary for splitting the paths at point B was also set to 3 consecutive images out of the access radius. The lines in Fig. 23.6 were drawn by a pen attached to the rear of the robot. Therefore, they represent the motion of the rear, not the centre of the robot, which causes the arcs that appear when the robot changes direction. As the picture shows, the robot was able to start at the beginning of sequence L1 and finish at the end of sequence L2, and vice versa. Regardless of its starting point, at point A it always defaulted to the only known path L1. This explains the small arc that appears at point A in path F2. The arc represents an adjustment of the heading when the robot defaulted to path L1. The direction the robot takes at point B depends on the established goal. If the goal is to follow path L1, it continues along that path. If the goal is to follow path L2, it will disambiguate the predictions to retrieve only images from path L2. That behaviour explains the changes in direction that appear in the red line (F1) at point B. The arcs were drawn when the robot started at path L1, but with the goal of reaching the end of path L2.

Fig. 23.6 Results: paths taught and followed. The robot successfully switches from one path to another at node points A and B


Fig. 23.7 Typical city view, where the traffic turn is temporarily occluded by passing cars

Fig. 23.8 Paths learnt (blue and black) and followed, with small scenario changes. The robot plans the routes correctly and is immune to small changes in the reconstituted urban scenario

23.6.2 Tests in a Simulated Urban Environment

In a second experiment, the scenario was filled with images mimicking a typical city environment. Urban environments change very often. Ideally, the robot should learn one path in an urban environment and still be able to follow it when there are small changes, up to an acceptable level. For example, Fig. 23.7 shows two pictures of a traffic turn, taken only a few seconds apart. Although the rest of the scene is unchanged, one picture captures only the back of a car in the background, while the other captures a side view of another car in the foreground. Due to the small dimensions of the robot, it was not tested in a real city environment, but in a reconstruction of one. Figure 23.8 shows the results. Figure 23.8a shows the first scenario, in which the robot was taught. In that scenario the robot, during segment AB, is guided essentially by the image of the traffic turn without the car. In a second part of the same experiment, the picture of the traffic turn was replaced by the other picture, with the car in the foreground, and the robot was made to follow the same paths. Again, it had to start at path L1 and finish at path L2, and vice versa. As Fig. 23.8b shows, it successfully completed the tasks.

23.7 Conclusions

Navigation based on view sequences is still an open research question. In this paper, a novel method was proposed that provides vision-based navigation based on an SDM. During a learning stage, the robot learns new paths. Connection points are established when two paths come together or split apart. That way, a topological representation of the space is built, which gives the robot the ability to switch from one sequence to another and plan new paths. One drawback of this approach is that the SDM model, simulated in software as in this case, requires a lot of processing and is not fast enough to operate in real time if the number of images is very large. Another disadvantage is that, using just front views, the robot only merges paths that come together with the same heading. That problem can be solved by using metric information to determine when the robot is in a place where it has already been, even with another heading. Another possibility is to use omnidirectional images. The results shown demonstrate the feasibility of the approach. The robot was tested in two different environments: one a reconstitution of a country environment, the other a reconstitution of a changing urban environment. It was able to complete the tasks, even under changing conditions.

References 1. Bose J (2003) A scalable sparse distributed neural memory model. Master’s thesis, University of Manchester, Faculty of Science and Engineering, Manchester, UK 2. Ishiguro H, Tsuji S (1996) Image-based memory of environment. In: Proceedings of the IEEE/RSJ international conference on intelligent robots and systems 3. Jaeckel LA (1989) An alternative design for a sparse distributed memory. Technical report, Research Institute for Advanced Computer Science, NASA Ames Research Center 4. Johnson S (2004) Mind wide open. Scribner, New York 5. Kanerva P (1988) Sparse distributed memory. MIT Press, Cambridge 6. Matsumoto Y, Ikeda K, Inaba M, Inoue H (1999) Exploration and map acquisition for viewbased navigation in corridor environment. In: Proceedings of the international conference on field and service robotics, pp 341–346 7. Matsumoto Y, Inaba M, Inoue H (2000) View-based approach to robot navigation. In: Proceedings of 2000 IEEE/RSJ international conference on intelligent robots and systems (IROS 2000) 8. Matsumoto Y, Inaba M, Inoue H (2003) View-based navigation using an omniview sequence in a corridor environment. In: Machine vision and applications


9. Mendes M, Paulo Coimbra A, Crisóstomo MM (2010) Path planning for robot navigation using view sequences. In: Lecture notes in engineering and computer science: proceedings of the World Congress on engineering 2010, WCE 2010, London, UK 10. Mendes M, Crisóstomo MM, Paulo Coimbra A (2008) Robot navigation using a sparse distributed memory. In: Proceedings of the 2008 IEEE international conference on robotics and automation, Pasadena, CA, USA 11. Mendes M, Crisóstomo MM, Paulo Coimbra A (2009) Assessing a sparse distributed memory using different encoding methods. In: Proceedings of the 2009 international conference of computational intelligence and intelligent systems, London, UK 12. Meyer J (2003) Map-based navigation in mobile robots: Ii. A review of map-learning and path-planning strategies. Cogn Syst Res 4(4):283–317 13. Rasmussen C, Hager GD (1996) Robot navigation using image sequences. In: Proceedings of AAAI, pp 938–943 14. Ratitch B, Precup D (2004) Sparse distributed memories for on-line value-based reinforcement learning. In: ECML 15. Winters N, Santos-Victor J (1999) Mobile robot navigation using omni-directional vision. In: Proceedings of the 3rd Irish machine vision and image processing conference (IMVIP’99), pp 151–166

Chapter 24

A Novel Approach for Combining Genetic and Simulated Annealing Algorithms

Younis R. Elhaddad and Omar Sallabi

Abstract The Traveling Salesman Problem (TSP) is the most well-known NP-hard problem and is used as a test bed to check the efficacy of combinatorial optimization methods. No polynomial-time algorithm is known that can solve it, and all known exact algorithms for NP-complete problems require time that grows excessively with the problem size. Problems of this kind do not lend themselves to exact algorithmic solutions, which creates a dependence on heuristic search as an Artificial Intelligence (AI) problem-solving technique. There are numerous examples of these techniques, such as Genetic Algorithms (GA), Evolution Strategies (ES), Simulated Annealing (SA), Ant Colony Optimization (ACO), Particle Swarm Optimizers (PSO) and others, which can be used to solve large-scale optimization problems. Some of them are time-consuming, however, while others cannot find the optimal solution. Because of this, many researchers have combined two or more algorithms in order to improve solution quality and reduce execution time. In this work, new operations and techniques are used to improve the performance of GA [1], and this improved GA is then combined with SA to implement a hybrid algorithm (HGSAA) to solve the TSP. The hybrid algorithm was tested using known instances from TSPLIB (a library of sample instances for the TSP available on the Internet), and the results are compared against some recent related works. The comparison shows that the HGSAA is effective in terms of both results and time.

Y. R. Elhaddad (corresponding author), O. Sallabi: Faculty of Information Technology, Garyounis University, P.O. 1308, Benghazi, Libya

S. I. Ao and L. Gelman (eds.), Electrical Engineering and Applied Computing, Lecture Notes in Electrical Engineering, 90, DOI: 10.1007/978-94-007-1192-1_24, Springer Science+Business Media B.V. 2011


24.1 Introduction

Many problems of practical and theoretical importance within the fields of artificial intelligence and operations research are of a combinatorial nature. In these problems, there is a finite solution set $X$ and a real-valued function $f: X \to \mathbb{R}$, and the goal is to search for a solution $x^* \in X$ with $f(x^*) \le f(x)\ \forall x \in X$. The goal of an optimization problem can be formulated as follows: rearrange control or decision variables according to some constraints in order to minimize or maximize the value of an objective function [2]. The most widely known example of a combinatorial optimization problem is the Traveling Salesman Problem (TSP) [2–4]. Problem-solving is an area of Artificial Intelligence (AI) concerned with finding or constructing the solution to a difficult problem, such as a combinatorial optimization problem, using AI algorithms such as Genetic Algorithms (GA), Simulated Annealing (SA), Ant Colony Optimization (ACO), Particle Swarm Optimizers (PSO), Iterated Local Search (ILS), Tabu Search (TS), and others. These can be used to solve large-scale optimization problems. Some of them are time-consuming, however, and others cannot find the optimal solution within the time constraints. Thus many researchers have combined two or more algorithms in order to improve solution quality and reduce execution time. In this work, new techniques and operations are applied to GA in order to improve its performance. This improved GA is then combined with SA, using a new approach to the combination, to produce a new Hybrid Genetic and Simulated Annealing Algorithm (HGSAA). The proposed algorithm was tested using symmetric TSP instances from TSPLIB [5], and the results show that the algorithm is able to find an optimal or near-optimal solution for instances of varying sizes.

24.2 The Travelling Salesman Problem

The Travelling Salesman Problem (TSP) is a classic combinatorial optimization problem and one of the most widely known NP-hard (Non-deterministic Polynomial-time hard) problems [3]. It is stated as follows: given a number of cities with associated city-to-city distances, what is the shortest round-trip tour that visits each city exactly once and then returns to the start city [6]? The TSP can also be stated as follows: given a complete graph $G$ with a set of vertices $V$, a set of edges $E$, and a cost $c_{ij}$ associated with each edge in $E$, where $c_{ij}$ is the cost incurred when traversing from vertex $i \in V$ to vertex $j \in V$, a solution to the TSP must return the minimum-distance Hamiltonian cycle of $G$. A Hamiltonian cycle is a cycle that visits each node in a graph exactly once and returns to the starting node; in TSP terms this is referred to as a tour. The real problem is to decide in which order to visit the nodes. While easy to explain, this problem is not always easy to solve. The TSP is classified as NP-hard, and no polynomial-time algorithm is known that can solve it. The TSP became popular at the same time as the new subject of linear programming arose, along with the challenges of solving combinatorial problems. The TSP exhibits all the characteristics of combinatorial optimization, so it is used to check the efficacy of combinatorial optimization methods and is often the first problem researchers use to test a new optimization technique [7].

Fig. 24.1 Comparison of convergence rate for IGA and HGSAA

Different types of TSP can be identified by the properties of the cost matrix. The repository TSPLIB [5] contains many different types of TSP and related problems. This work deals with symmetric TSP of type EUC_2D. In the symmetric TSP (STSP), $c_{ij} = c_{ji}\ \forall i, j$; otherwise the problem is referred to as asymmetric (ATSP). The data for an STSP instance in TSPLIB contain the problem name (usually a name followed by the number of cities in the problem; e.g., kroA100 and rd100 both contain 100 cities). The data also provide the user with an $n \times 3$ array, where $n$ is the number of cities, the first column is the index of each city, and columns two and three are the positions of the city on the x-axis and the y-axis. Each city in a tour is assumed to be marked by its position $(x_i, y_i)$ in the plane (see Fig. 24.1), and the cost matrix $c$ contains the Euclidean distances between the $i$th and $j$th cities:

$$c_{ij} = \sqrt{(x_i - x_j)^2 + (y_i - y_j)^2} \qquad (24.1)$$


The objective of TSP is to minimize the function $f$, where

$$f = \sum_{i=1}^{n-1} c_{i,i+1} + c_{1,n} \qquad (24.2)$$

The search space of a Euclidean TSP of $N$ cities contains $N!$ permutations. The objective is to find a permutation of the $N$ cities that has minimum cost. For a symmetric problem with $n$ cities there are $(n-1)!/2$ possible tours.
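As a minimal sketch of Eqs. 24.1 and 24.2 (not the authors' implementation), the cost matrix and the tour cost could be computed as follows. The function names and the four-city instance are hypothetical, and cities are indexed from 0 rather than 1.

```python
import math

def euclidean_cost_matrix(coords):
    """Build the symmetric cost matrix c[i][j] of Eq. 24.1 from (x, y) pairs."""
    n = len(coords)
    return [[math.hypot(coords[i][0] - coords[j][0], coords[i][1] - coords[j][1])
             for j in range(n)] for i in range(n)]

def tour_cost(tour, c):
    """Eq. 24.2: sum of consecutive edges plus the edge that closes the cycle."""
    n = len(tour)
    open_path = sum(c[tour[i]][tour[i + 1]] for i in range(n - 1))
    return open_path + c[tour[0]][tour[n - 1]]

def symmetric_tour_count(n):
    """Number of distinct tours of a symmetric n-city instance: (n - 1)! / 2."""
    return math.factorial(n - 1) // 2

# Hypothetical instance: the corners of a 3 x 4 rectangle.
coords = [(0, 0), (3, 0), (3, 4), (0, 4)]
c = euclidean_cost_matrix(coords)
perimeter = tour_cost([0, 1, 2, 3], c)   # walks around the rectangle
crossing = tour_cost([0, 2, 1, 3], c)    # uses both diagonals
```

With four cities there are only $(4-1)!/2 = 3$ distinct tours, so the two orderings above already cover two of them; the count grows factorially, which is why exhaustive search is impractical for the TSPLIB instances used later.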

24.3 Genetic Algorithm

Evolutionary computation (EC) is based on the biological evolution processes of living organisms, following the theory of natural selection and survival of the fittest. EC operates iteratively on a population of individuals (candidate solutions to a problem). Operations such as reproduction, recombination, mutation and selection result in the "survival of the fittest", that is, the best solution occurring in the population of solutions. Genetic algorithms (GAs) are a specific type of Evolutionary Algorithm (EA). GAs are the centre of attention here because they appear to be the evolutionary algorithms best suited to combinatorial optimization problems. The power of GAs comes from their being a reliable, robust optimization method applicable to a variety of complex problems. In general, GAs can be described as follows. A genetic algorithm starts by generating a random population of possible solutions. Each individual of the population is represented (coded) by a DNA string, called a chromosome, which contains a string of problem parameters. Individuals from the population are selected based on their fitness values. The selected parents are recombined to form a new generation. This process is repeated until some termination condition is met.

24.4 Simulated Annealing The purpose of physical annealing is to accomplish a low energy state of a solid. This is achieved by melting the solid in a heat bath and gradually lowering the temperature in order to allow the particles of the solid to rearrange themselves in a crystalline lattice structure. This structure corresponds to a minimum energy state for the solid. The initial temperature of the annealing process is the point at which all particles of the solid are randomly arranged within the heat bath. At each temperature, the solid must reach what is known as thermal equilibrium before the cooling can continue [8]. If the temperature is reduced before thermal equilibrium is achieved, a defect will be frozen into the lattice structure and the resulting crystal will not correspond to a minimum energy state.


The Metropolis Monte Carlo simulation can be used to simulate the annealing process at a fixed temperature T. The Metropolis method randomly generates a sequence of states for the solid at the given temperature. A solid's state is characterized by the positions of its particles. A new state is generated by small movements of randomly chosen particles. The change in energy $\Delta E$ caused by the move is calculated, and acceptance or rejection of the new state as the next state in the sequence is determined according to the Metropolis acceptance condition: if $\Delta E < 0$ the move is accepted, and if $\Delta E > 0$ the move is accepted with probability $e^{-\Delta E / t}$, where $t$ denotes the temperature. That is, the move is accepted if $e^{-\Delta E / t} > X$ and rejected otherwise, where $X$ is a random number with $0 < X < 1$.

Simulated annealing algorithms have been applied to solve numerous combinatorial optimization problems. The name and idea of SA come from annealing in metallurgy, a technique involving heating and controlled cooling of a material to increase the size of its crystals and reduce their defects. The heat frees the atoms to move from their initial positions (initial energy). As the material is slowly cooled, the atoms continuously rearrange, moving toward a lower energy level. They gradually lose mobility due to the cooling, and as the temperature is reduced the atoms tend to crystallize into a solid. In the simulated annealing method, each solution $s$ in the search space is equivalent to a state of a physical system, and the function $f(s)$ to be minimized is equivalent to the internal energy of that state. The objective is to minimize the internal energy as much as possible. For successful annealing it is important to use a good annealing schedule, reducing the temperature gradually. The SA starts from a random solution $x_p$, selects a neighbouring solution $x_n$ and computes the difference in the objective function values, $\Delta f = f(x_n) - f(x_p)$. If the objective function is improved ($\Delta f < 0$), the present solution $x_p$ is replaced by the new one $x_n$; otherwise the new solution, although it worsens the objective function, is accepted with probability $pr = 1/(1 + e^{\Delta E / t})$, where $\Delta E = \Delta f$, $t$ is the temperature or control parameter, and $pr$ decreases as the algorithm progresses. This acceptance is decided by generating a random number $rn$ with $0 \le rn \le 1$ and comparing it against the threshold: if $pr > rn$, the current solution is replaced by the new one. The procedure is repeated until a termination condition is satisfied.
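The acceptance rule described above can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: the geometric cooling factor `alpha`, the number of steps, and the toy objective are hypothetical choices, while the initial temperature of 80 is the value quoted later for the hybrid algorithm.

```python
import math
import random

def acceptance_probability(delta_f, t):
    """Logistic rule from the text: pr = 1 / (1 + exp(delta_f / t))."""
    z = delta_f / t
    if z > 700:  # exp(z) would overflow a float; pr is effectively zero
        return 0.0
    return 1.0 / (1.0 + math.exp(z))

def accept(delta_f, t, rng=random):
    """Accept improving moves outright, worsening moves with probability pr."""
    if delta_f < 0:
        return True
    return acceptance_probability(delta_f, t) > rng.random()

def anneal(f, neighbour, x0, t0=80.0, alpha=0.9, steps=200, rng=random):
    """Minimise f starting from x0 and return the best solution seen.

    The geometric schedule t <- alpha * t is an assumed cooling schedule."""
    current, best, t = x0, x0, t0
    for _ in range(steps):
        candidate = neighbour(current, rng)
        if accept(f(candidate) - f(current), t, rng):
            current = candidate
            if f(current) < f(best):
                best = current
        t *= alpha
    return best

# Toy usage: minimise (x - 3)^2 over the integers with unit steps.
rng = random.Random(0)
f = lambda x: (x - 3) ** 2
step = lambda x, r: x + r.choice([-1, 1])
best = anneal(f, step, 10, rng=rng)
```

For a worsening move the probability is always below 1/2 and shrinks as the temperature falls, which matches the statement that $pr$ decreases as the algorithm progresses.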

24.5 Improved Genetic Algorithm Technique

Crossover is the most important operation of GA, because it is in this operation that characteristics are exchanged between the individuals of the population. Accordingly, the improved GA (IGA) concentrates on this operation rather than on population size: the initial population consists of only two individuals, to which the Population Reformulates Operation (PRO) is applied. Multi-crossover is applied to these individuals to produce 100 children with different characteristics inherited from their parents, and ten copies of these children are made. Multi-mutation is then applied, where each copy is mutated with a different method. The fitness function is evaluated for each individual, the best two individuals are selected, and finally the Partial Local Optimal (PLO) mutation operation is applied to the next generation. In the technique used for IGA, the tour is divided into three parts by two randomly selected cut points ($p_1$ and $p_2$). The head contains positions $(1, 2, \ldots, p_1 - 1)$, the middle contains $(p_1, p_1 + 1, \ldots, p_2)$, and the tail contains $(p_2 + 1, p_2 + 2, \ldots, n)$. Using multi-crossover, the head of the first parent is exchanged with the tail of the second parent. The middle remains unchanged until the partial local optimal mutation operation is applied, which improves the middle of the tour by finding its local minimum. The role of the population reformulates operation is to change the structure of the tour by exchanging the head and the tail with the middle. In this way the procedure ensures that new cities will be in the middle part of each cycle, ready for improvement.

24.5.1 Multi-Crossover Operation

Crossover is the most important operation of GA because it exchanges characteristics between the individuals. Accordingly, many types of crossover operations are used to produce offspring with different attributes in order to build up an overall view of the search space. Multi-crossover works as follows. The basic principle of this crossover is two random cut points ($p_1$ and $p_2$), which define a head containing positions $(1, 2, \ldots, p_1 - 1)$, a middle containing $(p_1, p_1 + 1, \ldots, p_2)$, and a tail containing $(p_2 + 1, p_2 + 2, \ldots, n)$. The head and tail of each parent are flipped, and then the head of the first parent is swapped with the tail of the other parent, and vice versa. For example, suppose the two randomly selected crossover points are $p_1 = 4$ and $p_2 = 7$, and the two parent tours are:

Parent1 → 9 1 5 | 7 4 8 6 | 2 10 3   (head1 | middle | tail1)
Parent2 → 2 8 5 | 6 3 1 4 | 7 10 9   (head2 | middle | tail2)

For a valid tour, the elements of head2 and tail2 are removed from Parent1 to give mid1:

mid1 → 1 4 6 3

In the same way, the elements of head1 and tail1 are removed from Parent2 to give mid2:

mid2 → 8 6 4 7


Step 1 If the parts (head2, mid1, tail2) are reconnected using all possible permutations, six different children (3!) can be obtained, e.g.

child1 → 2 8 5 1 4 6 3 7 10 9

In the same way, six other children are produced from (head1, mid2, tail1), e.g.

child2 → 9 1 5 8 6 4 7 2 10 3

Step 2 If the two heads are flipped and the parts are reconnected as in step 1, 12 new different children are produced, e.g.

child3 → 5 8 2 1 4 6 3 7 10 9
child4 → 5 1 9 8 6 4 7 2 10 3

Step 3 If the two tails are flipped and the parts are reconnected as in step 1, 12 new different children are produced, e.g.

child5 → 2 8 5 1 4 6 3 9 10 7
child6 → 9 1 5 8 6 4 7 3 10 2

Step 4 If the two middles are flipped and the parts are reconnected as in step 1, 12 new different children are produced, e.g.

child7 → 2 8 5 3 6 4 1 7 10 9
child8 → 9 1 5 7 4 6 8 2 10 3

Step 5 If both the heads and the tails are flipped and the parts are reconnected as in step 1, 12 new different children are produced, e.g.

child9 → 5 8 2 1 4 6 3 9 10 7
child10 → 5 1 9 8 6 4 7 3 10 2

In each step 12 children are produced; therefore $5 \times (3!) \times 2 = 60$ completely different children are produced from just two parents.
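The five steps can be reproduced with a short sketch. The function names are hypothetical and the code is an illustration of the procedure as described here, not the authors' implementation; it generates the children family by family rather than step by step, which yields the same set of 60 tours.

```python
from itertools import permutations

def split(tour, p1, p2):
    """Cut a tour into head (1..p1-1), middle (p1..p2) and tail (p2+1..n), 1-based."""
    return tour[:p1 - 1], tour[p1 - 1:p2], tour[p2:]

def multi_crossover(parent1, parent2, p1, p2):
    head1, _, tail1 = split(parent1, p1, p2)
    head2, _, tail2 = split(parent2, p1, p2)
    # Cities left over, in parent order, so that every child is a valid tour.
    mid1 = [city for city in parent1 if city not in head2 and city not in tail2]
    mid2 = [city for city in parent2 if city not in head1 and city not in tail1]

    def rev(segment):
        return segment[::-1]

    children = []
    for head, mid, tail in ((head2, mid1, tail2), (head1, mid2, tail1)):
        variants = [
            (head, mid, tail),            # step 1: segments as they are
            (rev(head), mid, tail),       # step 2: head flipped
            (head, mid, rev(tail)),       # step 3: tail flipped
            (head, rev(mid), tail),       # step 4: middle flipped
            (rev(head), mid, rev(tail)),  # step 5: head and tail flipped
        ]
        for parts in variants:
            for order in permutations(parts):  # 3! orderings of the three parts
                children.append([city for segment in order for city in segment])
    return children

parent1 = [9, 1, 5, 7, 4, 8, 6, 2, 10, 3]
parent2 = [2, 8, 5, 6, 3, 1, 4, 7, 10, 9]
children = multi_crossover(parent1, parent2, 4, 7)
```

For the two parents of the worked example, the list contains child1 to child10 exactly as printed above, and all 60 sequences are distinct permutations of the ten cities.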

24.5.2 Selection Operation

Using rank selection, the best two individuals are selected for the next operations in order to reduce the execution time.

24.5.3 Mutation

The inversion mutation operation is used here: a random subtour is selected from the second individual and is then inverted.
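A minimal sketch of inversion mutation is given below. The function name is hypothetical, and choosing the two end points uniformly at random is an assumption, since the text does not say how the subtour is drawn.

```python
import random

def inversion_mutation(tour, rng=random):
    """Reverse a randomly chosen contiguous subtour and return a new list."""
    i, j = sorted(rng.sample(range(len(tour)), 2))  # two distinct positions, i < j
    return tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]

rng = random.Random(1)
tour = list(range(1, 11))
mutated = inversion_mutation(tour, rng)
```

Because the two positions are distinct, the reversed segment has at least two cities, so the mutated tour always differs from the original while remaining a valid permutation.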


24.5.4 The Rearrangement Operation

This operation is applied to both individuals. Let $c_{i,j}$ be the cost between the two adjacent cities city$_i$ and city$_j$, where $i = 1, 2, 3, \ldots, n - 1$ and $j = i + 1$. The aim of this operation is to find the greatest (max) value of $c_{i,j}$ among all the adjacent cities on the tour, and then swap city$_i$ with three other cities, one at a time. These cities are located at three different positions on the tour (beginning, middle, and end). The best of these positions and the original position is accepted. This operation works in a random manner: it may not achieve any improvement after several iterations, but it is just as likely to take a big jump and improve the result.
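One possible reading of the rearrangement operation is sketched below. The exact positions used for "beginning, middle, and end" (indices 0, n // 2 and n - 1) and the function names are assumptions, as is the six-city instance.

```python
def tour_cost(tour, c):
    """Length of the closed tour under cost matrix c."""
    n = len(tour)
    return sum(c[tour[k]][tour[(k + 1) % n]] for k in range(n))

def rearrangement(tour, c):
    """Swap the city that starts the costliest edge into a better position, if any."""
    n = len(tour)
    # Position i whose edge to the next city on the tour (j = i + 1) costs the most.
    i = max(range(n - 1), key=lambda k: c[tour[k]][tour[k + 1]])
    best = list(tour)
    for target in (0, n // 2, n - 1):  # beginning, middle, end: assumed positions
        candidate = list(tour)
        candidate[i], candidate[target] = candidate[target], candidate[i]
        if tour_cost(candidate, c) < tour_cost(best, c):
            best = candidate
    return best  # equal to the original tour when no swap helps

# Hypothetical instance: six cities on a line at x = 0..5.
xs = [0, 1, 2, 3, 4, 5]
c = [[abs(a - b) for b in xs] for a in xs]
original = [0, 4, 2, 3, 1, 5]
improved = rearrangement(original, c)
```

In this small instance the costliest edge starts at the first city, and swapping that city into the middle position shortens the tour from 18 to 12.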

24.5.5 Partial Local Optimal Mutation Operation

In this operation, a subtour of an individual is selected randomly, with its size in the range $3 \le$ size of subtour $< n/4$. The ordering of this subtour that gives its local minimum is then found and exchanged with the original subtour. This operation is undertaken on one of the selected individuals after the mutation operation is performed.
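The text does not say how the local minimum of the subtour is found. The sketch below assumes exhaustive enumeration of the subtour's orderings with its two neighbours held fixed, and caps the subtour size with a hypothetical `max_size` parameter so that the enumeration stays small; all names are illustrative.

```python
import math
import random
from itertools import permutations

def tour_cost(tour, c):
    """Length of the closed tour under cost matrix c."""
    n = len(tour)
    return sum(c[tour[k]][tour[(k + 1) % n]] for k in range(n))

def plo_mutation(tour, c, rng=random, max_size=7):
    """Replace a random subtour by its cheapest ordering between fixed neighbours."""
    n = len(tour)
    upper = min(max_size, (n - 1) // 4)  # largest size strictly below n/4, capped
    if upper < 3:
        return list(tour)                # instance too small for the stated range
    size = rng.randint(3, upper)
    start = rng.randint(1, n - size - 1)  # keeps one fixed city on each side
    before, after = tour[start - 1], tour[start + size]
    segment = tour[start:start + size]

    def path_cost(seq):
        inner = sum(c[a][b] for a, b in zip(seq, seq[1:]))
        return c[before][seq[0]] + inner + c[seq[-1]][after]

    best = min(permutations(segment), key=path_cost)
    return tour[:start] + list(best) + tour[start + size:]

# Hypothetical instance: 20 random points in the unit square.
rng = random.Random(0)
coords = [(rng.random(), rng.random()) for _ in range(20)]
c = [[math.hypot(ax - bx, ay - by) for bx, by in coords] for ax, ay in coords]
tour = list(range(20))
mutated = plo_mutation(tour, c, rng)
```

Since the original ordering is among the candidates, the operation can never lengthen the tour; it only changes the edges inside and adjacent to the chosen subtour.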

24.6 The Proposed Hybrid Algorithm (HGSAA)

The proposed HGSAA is designed by combining the IGA and SA in order to reap the benefits of SA and reduce the time that IGA spends stuck at local minima. The initial temperature of SA is set to a small value, 80, because SA performs only ten cycles; this temperature ensures that SA can reach the state of equilibrium within these cycles. The hybrid algorithm starts with a random population, which is used as the input of the GA. Multi-crossover is then applied to produce 60 different children. The fitness of the parents and their offspring is calculated and, depending on the results, a new population of the same size as the original population is selected. A partial local optimal mutation operation is then applied to one individual (according to the mutation probability) in order to improve its fitness value. The rearrangement operation is also applied to the population. This process continues until there is no improvement in the results for ten consecutive iterations. The memorized population from the GA that provides the best result is then transferred to the SA. The SA process is used to improve the results by using the nearest solution technique. If the results are no longer improved within ten consecutive iterations, the best memorized population from the SA is moved back to the GA to repeat the above process. Figure 24.1 shows the convergence rate of HGSAA and IGA for the dsj1000 problem from TSPLIB [5]. Unlike the curve of HGSAA, the curve of the IGA gets stuck, and the result stays steady at many positions along the curve. In other words, in HGSAA the SA prevents the algorithm from being stuck for a long time and improves the results faster than the GA alone does.

Table 24.1 Results of HGSAA

Problem   Optimal  Best result  Iteration   Time (s)  Average   St. dev.  Error (%)
eil101    629      629          400         17 (15)   632.9     2.8       0
ch130     6110     6126         500         26        6146.7    14.8      0.6
ch150     6528     6528         750 (292)   46 (18)   6540.4    13.9      0
kroA100   21282    21282        400 (171)   18 (7)    21319.8   32.5      0
kroA150   26524    26524        800 (407)   53 (27)   26588.7   62.3      0
kroA200   29368    29382        1100        85        29434.9   45.7      0.23
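The alternation between the IGA and SA phases described in Section 24.6 can be sketched as a small driver loop. This is only a schematic under stated assumptions: `ga_step` and `sa_step` are hypothetical callables standing in for one IGA generation and one SA cycle, `solution` may be a single tour or a whole population, and the fixed number of `rounds` replaces the generation or time limits used in the experiments.

```python
import random

def run_phase(step, solution, cost, patience=10):
    """Repeat `step` until `patience` consecutive iterations bring no improvement.

    Returns the best (memorized) solution seen during the phase."""
    current = best = solution
    stale = 0
    while stale < patience:
        current = step(current)
        if cost(current) < cost(best):
            best, stale = current, 0
        else:
            stale += 1
    return best

def hgsaa(ga_step, sa_step, start, cost, rounds=5, patience=10):
    """Alternate a GA phase and an SA phase, handing the best solution across."""
    best = start
    for _ in range(rounds):
        best = run_phase(ga_step, best, cost, patience)  # IGA until it stalls
        best = run_phase(sa_step, best, cost, patience)  # SA starts from IGA's best
    return best

# Toy stand-ins for the two phases, minimising |x - 42| over the integers.
rng = random.Random(0)
cost = lambda x: abs(x - 42)
ga_step = lambda x: x + rng.choice([-5, 5])  # coarse moves
sa_step = lambda x: x + rng.choice([-1, 1])  # fine moves
result = hgsaa(ga_step, sa_step, 0, cost)
```

The `patience` of ten iterations mirrors the stopping rule stated in the text for both phases; because each phase returns its memorized best, the value handed across never gets worse.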

24.7 Experimental Results of HGSAA

The following sections discuss the results of the experiments and compare them with some recent related work that used hybrid genetic algorithms to solve the TSP.

24.7.1 Comparison with LSHGA

Instances of 100 cities or more from TSPLIB [5] that were used by Zhang and Tong [9] are used here. The same number of generations is used for each instance in order to compare the results of HGSAA with those of the local search heuristic genetic algorithm (LSHGA) [9]. The HGSAA was run for ten trials on each instance, and the summarized results are shown in Table 24.1, where column 2 shows the known optimal solutions; column 3 shows the best result obtained by the HGSAA; column 4 indicates the number of generations performed, with the number of generations needed to obtain the optimal result in parentheses; column 5 indicates the time in seconds used for each instance, with the time to obtain the optimal result in parentheses; column 6 shows the average of the ten results for each instance; column 7 shows the standard deviation of the ten results for each instance; and column 8 shows the error ratio between the best result and the optimal, which is calculated according to Eq. 24.3. The results of LSHGA are summarized in Table 24.2. The notations PS, CN, BS and Error denote the population size of the algorithm, the convergence iteration number, the best solution of the LSHGA, and the error, respectively. Errors are calculated according to Eq. 24.3.


Table 24.2 Results of LSHGA

Problem   PS    CN    BS      Error (%)
eil101    300   400   640     1.75
ch130     350   500   6164    0.88
ch150     400   750   6606    1.19
kroA100   300   400   21296   0.66
kroA150   450   800   26775   0.95
kroA200   500   1100  29843   1.62

Table 24.3 Results of HGSAA and HGA

Instance  Optimal  HGSAA            HGA [10]
Eil51     426      426 (428.98)     426 (428.87)
Eil76     538      538 (544.37)     538 (544.37)
Eil101    629      629 (640.2116)   629 (640.975)*
KroA100   21282    21282            21282
KroD100   21294    21294            21306
D198      15780    15781            15788
kroA200   29368    29368            29368

$$\text{Error} = \frac{\text{average} - \text{optimal}}{\text{optimal}} \times 100 \qquad (24.3)$$

From Tables 24.1 and 24.2 it is clear that the HGSAA performed better than the LSHGA. The HGSAA can find the optimal solution for four instances out of six, while LSHGA cannot find an optimal solution for any of the six instances. The error ratios in both tables indicate that the HGSAA performs much better than the LSHGA.
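As a quick check of Eq. 24.3, the two non-zero error entries of Table 24.1 can be recomputed from the reported averages and optima. The function name is hypothetical; the numbers are taken from the table.

```python
def error_percent(average, optimal):
    """Eq. 24.3: relative gap between the average result and the optimum, in percent."""
    return (average - optimal) / optimal * 100.0

ch130 = error_percent(6146.7, 6110)       # HGSAA on ch130, Table 24.1
kroA200 = error_percent(29434.9, 29368)   # HGSAA on kroA200, Table 24.1
```

Rounded to two decimals these give 0.60 and 0.23, which agree with the values listed for ch130 and kroA200.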

24.7.2 Comparison with HGA

The HGSAA has been compared to the HGA proposed by Andal Jayalakshmi et al. [10]. The HGSAA was run on seven known instances from TSPLIB [5], with ten trials for each one, as in [10]. For eil51, eil76, and eil101, the HGSAA used both integer and real tours. In Table 24.3, column 2 shows the known optimal solutions; column 3 shows the best result obtained by the HGSAA, with the real-valued result in parentheses; and column 4 indicates the best result of HGA from [10]. The comparison of the results summarized in Table 24.3 shows that HGSAA obtained better results than the HGA. For real tours, for instance eil101, a new best result is obtained by HGSAA, where formerly the best known result was reported by [10].

Table 24.4 Results of HGSAA and SAGA

           HGSAA               SAGA
Problem    Avg.   Std. dev.    Avg.   Std. dev.
dsj1000    1.72   0.19         2.27   0.39
d1291      1.91   0.596        3.12   1.12
fl1400     0.43   0.38         0.64   0.55
fl1577     0.92   0.41         0.64   0.55
pr2392     6.37   0.36         6.53   0.56

24.7.3 Comparison with SAGA

Chen and Pitt [11] proposed hybrid algorithms combining SA and GA and tested them on large-scale TSP instances, all of 1,000 cities or more. Table 24.4 shows the average error from the optimal solution and the standard deviation for each instance, for both HGSAA and SAGA. The termination condition for the HGSAA is set to 7200 s for all instances except fl1577 and pr2392, for which the time is set to 10800 s.

24.8 Conclusion and Future Work

Possible directions for future work can be summarized in the following points:

• To assess the proposed HGSAA, more empirical experiments may be needed for further evaluation of the algorithm. Comments made on the algorithm may increase its effectiveness and should therefore be discussed and taken into consideration.
• The data structures of the HGSAA can be refined, so that the execution time may be further reduced.
• Genetic algorithms can be hybridized with other heuristic techniques for further improvement of the results.
• The presented algorithm can be used to solve other combinatorial problems, such as DNA sequencing.

References

1. Elhaddad Y, Sallabi O (2010) A new hybrid genetic and simulated annealing algorithm to solve the traveling salesman problem. In: Lecture notes in engineering and computer science: proceedings of the world congress on engineering 2010, vol I, WCE 2010, June 30–July 2, London, UK, pp 11–14
2. Lawler EL (1976) Combinatorial optimization: networks and matroids. Holt, Rinehart, and Winston, New York
3. Larranaga P, Kuijpers CM, Murga RH, Inza I, Dizdarevic S (1999) Genetic algorithms for the travelling salesman problem: a review of representations and operators. CiteSeerX. citeseer.ist.psu.edu/318951.html. Accessed Nov 19, 2007
4. Fredman M et al (1995) Data structures for traveling salesmen. AT&T Labs Research. www.research.att.com/~dsj/papers/DTSP.ps. Accessed Feb 13, 2008
5. Heidelberg University. http://www.iwr.uni-heidelberg.de/groups/comopt/software/TSPLIB95/. Accessed Jan 22, 2007
6. Mitchell G, O'Donoghue D, Trenaman A (2000) A new operator for efficient evolutionary solutions to the travelling salesman problem. LANIA. www.lania.mx/~ccoello/mitchell00.ps.gz. Accessed Aug 22, 2007
7. Bhatia K (1994) Genetic algorithms and the traveling salesman problem. CiteSeer. http://citeseer.comp.nus.edu.sg/366188.html. Accessed Feb 26, 2008
8. Metropolis N et al (1953) Equation of state calculations by fast computing machines. Florida State University. www.csit.fsu.edu/~beerli/mcmc/metropolis-et-al-1953.pdf. Accessed Feb 17, 2008
9. Zhang J, Tong C (2008) Solving TSP with novel local search heuristic genetic algorithms. IEEE Xplore. http://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=4666929&isnumber=4666792. Accessed Jan 12, 2009
10. Jayalakshmi G, Sathiamoorthy S, Rajaram R (2001) A hybrid genetic algorithm: a new approach to solve traveling salesman problem. CiteSeer. http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.2.3692. Accessed Jan 14, 2008
11. Chen S, Pitt G (2005) Isolating the benefits of respect. York University. http://www.atkinson.yorku.ca/~sychen/research/papers/GECCO-05_full.pdf. Accessed Jan 5, 2009

Chapter 25

Buyer Coalition Formation with Bundle of Items by Ant Colony Optimization

Anon Sukstrienwong

Abstract In electronic marketplaces, there are several buyer coalition schemes that aim to obtain the best discount and the best total group utility when buying a large volume of products. However, few schemes focus on group buying with bundles of items. This paper presents an approach called GroupBuyACO for forming a buyer coalition with bundles of items via ant colony optimization (ACO). The aim of the proposed algorithm is to find the best formation of buyers with heterogeneous preferences in order to earn the best discount from vendors. The buyer coalition is formed with regard to the bundles of items, the item prices, and the buyer reservations. The proposed algorithm is evaluated in simulation and compared with the GAGroupBuyer scheme by Sukstrienwong (Buyer formation with bundle of items in e-marketplaces by genetic algorithm. Lecture notes in engineering and computer science: proceedings of the international multiconference of engineers and computer scientists 2010, IMECS 2010, 17–19 March 2010, Hong Kong, pp 158–162). Experimental results indicate that the algorithm can improve the total discount of any coalition.

25.1 Introduction

At present, electronic commerce is becoming a necessary tool for many companies to sell their products because it is one of the fastest ways to advertise product information to a huge number of customers. Large quantities of products can

A. Sukstrienwong: Information Technology Department, School of Science and Technology, Bangkok University, Bangkok, Thailand

S. I. Ao and L. Gelman (eds.), Electrical Engineering and Applied Computing, Lecture Notes in Electrical Engineering, 90, DOI: 10.1007/978-94-007-1192-1_25, Springer Science+Business Media B.V. 2011


be sold rapidly within a few days, so companies can earn better profits from selling a large number of products. Ordinarily, many sellers offer attractive products at special prices. One strategy that sellers favor is selling their goods in bundles of items¹ at special prices. Moreover, several commercial websites, such as http://www.buy.yahoo.com.tw/ and https://www.shops.godaddy.com/, usually offer a volume discount to customers if the number of items sold is large. On the buyer side, most buyers prefer to build corresponding purchasing strategies to reduce their purchase cost. For this reason, buyer coalition formation is a buyer strategy that is rapidly becoming popular on the Internet, because buyers can improve their bargaining power and negotiate more advantageously with sellers to purchase goods at a lower price. In recent years, several buyer coalition schemes for electronic marketplaces have been developed. The main objective of these schemes is to gather all buyers' information in order to form a buyer coalition that purchases goods at low cost. This helps to reduce the cost of communication and makes buyers comfortable in joining a coalition. Ito et al. [10] presented an agent-mediated electronic market based on a group buy scheme, in which buyers or sellers can sequentially enter the market to make their decisions. Tsvetovat et al. [18] investigated the use of incentives to create buying groups. Yamamoto and Sycara [20] presented the GroupBuyAction scheme for forming buyer coalitions based on item categories. Hyodo et al. [8] then presented an optimal coalition formation among buyer agents based on genetic algorithms (GAs), with the purpose of distributing buyers among group-buying sites optimally to obtain good utilities. The Combinatorial Coalition Formation scheme of Li and Sycara [13] considers an e-marketplace where sellers set special offers based on volume.
Buyers place a bid on a combination of items together with reservation prices, where a reservation price is the maximum price that a buyer is willing to pay for an item. In the work of Mahdi [14], GAs are applied to negotiating intelligent agents in electronic commerce using a simplified standard protocol. However, only a few schemes, such as the GroupPackageString scheme by Sukstrienwong [16] and the GroupBuyPackage scheme by Laor et al. [11], focus on buyer coalitions with bundles of items. Of these, only the GroupPackageString scheme uses GAs to form the buyer coalition with bundles of items. This paper extends the corresponding conference paper, Sukstrienwong [17], with further results. The proposed approach applies the ACO technique to form buyer coalitions with the aim of maximizing the total discount. The paper is divided into five sections, including this introduction. The rest of the paper is organized as follows. Section 25.2 outlines group buying with bundles of items and the motivating problem. Section 25.3 presents the basic concept of ACO and the problem formalization of buyer formation with bundles of items. The experimental results of the simulation of the GroupBuyACO algorithm are given in Sect. 25.4. The conclusions and future work are given in the last section.

¹ Bundle of items, in the work of Gurler et al. [6], refers to the practice of selling two or more goods together in a package at a price which is below the sum of the independent prices.


Table 25.1 An example of price lists

Seller  Package    Toilet paper  Paper towel  Lotion     Detergent  Price ($)
s1      package11  pack of 1     –            –          –           8.9
s1      package12  –             pack of 1    –          –          14.0
s1      package13  –             pack of 3    –          –          32.5
s1      package14  pack of 1     pack of 6    –          –          50.9
s2      package21  –             –            pack of 1  –          10.5
s2      package22  –             –            –          pack of 1  19.0
s2      package23  –             –            –          pack of 4  67.0
s2      package24  –             –            pack of 1  pack of 8  92.0
s3      package31  –             pack of 1    –          –          14.0
s3      package32  –             –            –          pack of 1  19.0
s3      package33  –             pack of 3    –          pack of 1  49.5

25.2 Outline of Group Buying with Bundles of Items

In electronic marketplaces, sellers have more opportunity to sell their products in large numbers if their websites are well known among buyers. Moreover, the pricing strategy is one of the tools sellers can use to increase sales volume. Some sellers simultaneously make a single take-it-or-leave-it price offer to each unassigned buyer and to each buyer group, as defined by Dana [2]. In this paper, I assume that the buyer group is formed with one goal: maximizing the aggregate buyer utility, that is, the price discount received by being a member of a coalition. Additionally, the definition of a bundle of items differs slightly from that of Gurler et al. [6]; in this paper, it refers to several items sold together in a package of one or more goods at one price. The discount policy of sellers is based on the number of items bundled in the package. If the package is a pure bundle, the average price of each item is cheaper than the price of a single-item package. Suppose three sellers in the e-marketplace sell similar or identical products. Sellers prepare a large stock of goods and show the price list for each product. In this paper, I assume the agents are autonomous and able to form coalitions when such a choice is beneficial. An example of the three sellers' information is shown in Table 25.1. The first seller, called s1, sells two sizes of facial toner, 100 and 200 cc. To attract buyers, seller s1 has made special offers. Seller s1 offers package p13 at a price of $32.0. Package p13 is composed of three bottles of facial toner (200 cc). The average price of each bottle of facial toner (200 cc) is about 32.0/3 = 10.67 dollars/bottle, which is 14.0 - 10.67 = 3.33 dollars/bottle cheaper than a single bottle of facial toner (200 cc) in package p12.
At the same time, the third seller, called s3, offers package p33, which comprises three bottles of facial toner (200 cc) and one bottle of body lotion (250 cc) at a price of $49.5. However, a single bottle of facial toner (200 cc) and a single bottle of body lotion (250 cc) are offered individually in package p12 at the price


Table 25.2 An example of buyers' orders with the reservation prices
(each entry: number of items × (reservation price in $))

Buyer  Facial toner 100 cc  Facial toner 200 cc  Body lotion 1,500 cc  Body lotion 250 cc
b1     –                    1 × (9.0)            –                     1 × (10.5)
b2     –                    –                    –                     2 × (10.95)
b3     –                    –                    3 × (6.0)             1 × (6.0)
b4     1 × (8.0)            4 × (11.0)           –                     –

of $14.0 and in package p22 at the price of $19.0. Suppose there are some buyers who want to purchase products listed in Table 25.1. Because buyers have heterogeneous preferences, some buyers do not want to purchase a whole bundle of items on their own; they only need a few items. Suppose a buyer called b1 wants to purchase a bottle of facial toner (200 cc) and a bottle of body lotion (250 cc), as shown in Table 25.2. Typically, buyers have seen the price lists provided by all sellers before placing their orders. The problem of buyer b1 is as follows. If buyer b1 purchases those products alone, the total cost is 14.0 + 19.0 = 33.0 dollars, which is the highest price at that time. So buyer b1 participates in the group buying with the aim of obtaining better prices. Buyer b1 then places orders for the specific items with reservation prices of $9.0 for a bottle of facial toner (200 cc) and $10.5 for a bottle of body lotion (250 cc).
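The arithmetic of this example can be checked with a short sketch. The prices are the ones quoted in the text ($32.0 for package p13, $14.0 for p12, $19.0 for p22) together with the reservation prices of buyer b1; the function and variable names are illustrative assumptions and do not come from the paper.

```python
def per_item_price(package_price, items_in_package):
    """Average price of one item in a bundle (hypothetical helper)."""
    return package_price / items_in_package


# Prices quoted in the text of Sect. 25.2.
single_toner_price = 14.0                     # package p12, one bottle (200 cc)
bundle_average = per_item_price(32.0, 3)      # package p13, three bottles
saving_per_bottle = single_toner_price - bundle_average

# Buyer b1 buying alone: one toner (p12) plus one body lotion (p22).
standalone_cost_b1 = 14.0 + 19.0
# What b1 is willing to pay in total, from the reservation prices.
reservation_total_b1 = 9.0 + 10.5

print(f"average bundle price: {bundle_average:.2f}")
print(f"saving per bottle:    {saving_per_bottle:.2f}")
print(f"standalone cost b1:   {standalone_cost_b1:.1f}")
print(f"reservation total b1: {reservation_total_b1:.1f}")
```

The gap between the standalone cost and the reservation total is what motivates b1 to join a coalition that can buy larger bundles.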

25.3 Ant Colony Optimization for Buyer Coalition with Bundles of Items

25.3.1 The Basic Concept of ACO

The algorithm is based on an imitation of the foraging behavior of real ants, as described in the work of Goss et al. [5]. Ant colony optimization (ACO) algorithms are inspired by the behavior of real ants and are used to find good solutions to combinatorial optimization problems. The first ACO algorithm, known as ant system (AS), is described by Dorigo and Gambardella [3] and Dorigo and Di Caro [4]. ACO has been applied to classical NP-hard combinatorial optimization problems, such as the traveling salesman problem in the work of Lawler et al. [12], the quadratic assignment problem (QAP) by Maniezzo et al. [15], and the shop scheduling and mixed shop scheduling problems by Yamada and Reeves [19]. Applications of ACO appear in various fields. Ismail et al. [9] solve economic power dispatch problems using the ACO technique, and Alipour et al. [1] propose an ACO-based algorithm to enhance the quality of the final fuzzy classification system.


In nature, real ants are capable of finding the shortest path from a food source to their nest without using visual cues, as shown by Hölldobler and Wilson [7]. In ACO, a number of artificial ants build solutions to an optimization problem while updating pheromone information on their visited trails. Each artificial ant builds a feasible solution by repeatedly applying a stochastic greedy rule. While constructing its tour, an ant deposits a substance called pheromone on the ground and follows paths on which pheromone was previously deposited by other ants. Once all ants have completed their tours, the ant that found the best solution deposits pheromone on its tour according to the pheromone trail update rule. The best solution found in the current iteration is used to update the pheromone information. The pheromone $\tau_{ij}$, associated with the edge joining $i$ and $j$, is updated as follows:

$$\tau_{ij} \leftarrow (1-\rho)\,\tau_{ij} + \sum_{k=1}^{m} \Delta\tau_{ij}^{k}, \tag{25.1}$$

where $\rho \in (0,1]$ is the evaporation rate; evaporation ensures that old pheromone does not have too strong an influence on future decisions. $\Delta\tau_{ij}^{k}$ is the amount of pheromone laid on edge $(i, j)$ by ant $k$:

$$\Delta\tau_{ij}^{k} = \begin{cases} Q/L_k & \text{if edge } (i,j) \text{ is used by ant } k, \\ 0 & \text{otherwise,} \end{cases} \tag{25.2}$$

where $Q$ is a constant and $L_k$ is the length of the tour performed by ant $k$. When constructing a solution, an ant starts from the starting city and repeatedly moves to an unvisited city. At city $i$, ant $k$ selects the city $j$ to visit through a stochastic mechanism with probability $p_{ij}^{k}$ given by

$$p_{ij}^{k} = \begin{cases} \dfrac{\tau_{ij}^{\alpha}\,\eta_{ij}^{\beta}}{\sum_{l \in N_i^k} \tau_{il}^{\alpha}\,\eta_{il}^{\beta}} & \text{if } j \in N_i^k, \\ 0 & \text{otherwise,} \end{cases} \tag{25.3}$$

where $N_i^k$ is the feasible neighborhood of ant $k$ at city $i$, that is, the set of cities that ant $k$ has not yet visited. The parameters $\alpha$ and $\beta$ determine the relative influence of the pheromone trail and the heuristic information $\eta_{ij}$, which is given by

$$\eta_{ij} = \frac{1}{d_{ij}}, \tag{25.4}$$

where $d_{ij}$ is the length of the path between cities $i$ and $j$.
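Equations 25.1 to 25.3 can be sketched in a few lines of Python. This is a minimal illustration of the generic ant system rules, not the author's implementation: the function names, the nested-list representation of the pheromone matrix, and the choice to treat edges as directed are all assumptions made for the example.

```python
def transition_probabilities(i, feasible, tau, eta, alpha, beta):
    """Eq. 25.3: probability of moving from city i to each feasible city j."""
    weights = {j: (tau[i][j] ** alpha) * (eta[i][j] ** beta) for j in feasible}
    total = sum(weights.values())
    return {j: w / total for j, w in weights.items()}


def update_pheromone(tau, tours, lengths, rho, Q=1.0):
    """Eqs. 25.1 and 25.2: evaporate, then add Q / L_k on every edge ant k used.

    Assumption: edges are treated as directed; a symmetric problem would
    also add the deposit to the reverse edge.
    """
    n = len(tau)
    # Evaporation step, (1 - rho) * tau_ij, written into a fresh matrix.
    new_tau = [[(1.0 - rho) * tau[i][j] for j in range(n)] for i in range(n)]
    for tour, length in zip(tours, lengths):
        for a, b in zip(tour, tour[1:]):
            new_tau[a][b] += Q / length
    return new_tau
```

With uniform pheromone, the probabilities are driven entirely by the heuristic term, so shorter edges are preferred; as iterations proceed, the deposits on edges of short tours shift the balance towards the pheromone term.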

25.3.2 Problem Formalization

Let $S = \{s_1, s_2, \ldots, s_m\}$ be a set of sellers on the Internet, each offering to sell some or all of the goods in $G = \{g_1, g_2, \ldots, g_j\}$. Let $B = \{b_1, b_2, \ldots, b_n\}$ denote the


collection of buyers. Each buyer wants to purchase several items posted by some sellers in $S$. Seller $i$ makes special offers as a set of packages, denoted $PACKAGE_i = \{package_1^i, package_2^i, \ldots, package_k^i\}$. The average price per item is a monotonically decreasing function of the size of the package. $PACKAGE_i$ is associated with a set of prices, denoted $PRICE_i = \{price_1^i, price_2^i, \ldots, price_k^i\}$, where $price_k^i$ is the price of $package_k^i$, which is a combination of several items defined as $package_k^i = \{g_1^{i,k}, g_2^{i,k}, \ldots, g_j^{i,k}\}$ with $g_j^{i,k} \geq 0$. If a good $g_j$ is not bundled in $package_k^i$, then $g_j^{i,k} = 0$. Additionally, the product price of any seller $s_m$ is a function of the purchased quantity, denoted $p_m(q)$, where $q$ is the quantity of the product. The product price function is monotonically decreasing, $dp_m(q)/dq < 0$. If a buyer $b_m$ needs to buy some particular items offered by sellers in $S$, the buyer places an order denoted $Q_m = \{q_1^m, q_2^m, \ldots, q_j^m\}$, where $q_j^m$ is the quantity of item $g_j$ requested by buyer $b_m$. If $q_j^m = 0$, buyer $b_m$ has no request to purchase good $g_j$. Additionally, buyer $b_m$ must state a reservation price for each good associated with $Q_m$, denoted $RS_m = \{rs_1^m, rs_2^m, \ldots, rs_j^m\}$, where $rs_h^m \geq 0$, $0 \leq h \leq j$. In this paper, I assume that every buyer reservation price $rs_h^m$ is higher than or equal to the minimum price offered by sellers. The objective of the problem is to find the best utility of the coalition; the following terms and processes need to be defined. The coalition is a temporary alliance of buyers formed for the purpose of obtaining the best utility. The utility that buyer $b_m$ gains from buying $q_d^m$ items of $g_d$ at $price_d$ is $(rs_d^m - price_d)\,q_d^m$. The total utility of buyer $b_m$ is

$$\sum_{d=1}^{j} (rs_d^m - price_d)\,q_d^m. \tag{25.5}$$

The total utility of the group is then defined as follows:

$$U = \sum_{b_m \in B} \sum_{d=1}^{j} (rs_d^{b_m} - price_d)\,q_d^{b_m}, \tag{25.6}$$

where $j = |G|$.
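The two utility definitions translate directly into code. The sketch below assumes that each buyer is represented by a dictionary with a reservation-price list `rs` and a quantity list `q`, and that `price` holds the per-item price actually paid for each good; this data layout and the numbers in the example are hypothetical and chosen only for illustration.

```python
def buyer_utility(reservation, price, quantity):
    """Eq. 25.5: sum over goods of (reservation price - price paid) * quantity."""
    return sum((rs - p) * q for rs, p, q in zip(reservation, price, quantity))


def coalition_utility(buyers, price):
    """Eq. 25.6: total utility of the group, summed over all buyers."""
    return sum(buyer_utility(b["rs"], price, b["q"]) for b in buyers)


# Hypothetical example with two goods and two buyers.
price = [8.0, 10.0]
buyers = [
    {"rs": [9.0, 10.5], "q": [1, 1]},    # utility (9.0-8.0)*1 + (10.5-10.0)*1
    {"rs": [0.0, 10.95], "q": [0, 2]},   # a zero quantity contributes nothing
]
total = coalition_utility(buyers, price)
print(f"total coalition utility: {total:.2f}")
```

A good that a buyer does not request has quantity zero, so its reservation price never affects the result.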

25.3.3 Forming Buyer Group with Bundles of Items by Ants

The algorithm proposed in this paper provides a means of buyer coalition formation by ACO. There are some restrictions in this paper. Buyers quote their own prices after they have seen the price list of all packages provided by sellers. The buyer coalition is formed with respect to the price attribute only. And, the price per item is a monotonically decreasing function when the size of the package


Fig. 25.1 The work of one ant creating the trail <3 package12, 2 package23, …>

increases. Additionally, the rule of the coalition is that each buyer must be better off forming a group than buying individually. The buyer coalition cannot be formed if no utility is earned from forming the buyer group. The first step in forming a buyer coalition with bundles of items is to represent the problem as a graph in which the optimum solution is a certain path through the graph. In Fig. 25.1, the solid line represents a package selected by ant $k$. If the selected package is picked more than once, ant $k$ moves further along the solid line. Then ant $k$ deposits and updates the pheromone on the selected number of units of the specific package. In this particular problem, the ant randomly chooses the next package, which is represented by a dotted line. The probability $p_{ij}^{k}$ of selecting $i$ units of package $j$ is formally defined as

$$p_{ij}^{k} = \begin{cases} \dfrac{\tau_{ij}^{\alpha}\,\eta_{ij}^{\beta}}{\sum_{i}\sum_{l \in D} \tau_{il}^{\alpha}\,\eta_{il}^{\beta}} & \text{if } j \in D, \\ 0 & \text{otherwise,} \end{cases} \tag{25.7}$$

where $D$ is the set of packages offered by all sellers that have not yet been selected, and $\tau_{ij}$ is the intensity of the pheromone on the solid line. For instance, if at the starting point ant $k$ has selected three units ($i = 3$) of package12, the ant deposits its pheromone only on package12, at the unit of 3. Ant $k$ keeps moving along the path until all of the buyers' requests are matched. A possible result of the algorithm is shown in Fig. 25.1. The quantity of pheromone $\Delta\tau_{ij}^{k}$ is defined as follows:

$$\Delta\tau_{ij}^{k} = \begin{cases} Q/U^{k} & \text{if } i \text{ units of package } j \text{ are used by ant } k, \\ 0 & \text{otherwise,} \end{cases} \tag{25.8}$$

where $Q$ is equal to one and $U^{k}$ is the total utility of the coalition derived from ant $k$. Keep in mind that $\eta_{ij}$ is given by


$$\eta_{ij} = \begin{cases} \dfrac{\sum m_{ij}}{u_{ij}} & \text{if some items in the selected package are unmatched to the buyers' requests,} \\ 1 & \text{if all items in the selected package are matched to the buyers' requests,} \\ 0 & \text{otherwise,} \end{cases} \tag{25.9}$$

where $m_{ij}$ is the total number of items in the selected packages that are matched to the buyers' requests, and $u_{ij}$ is the total number of items in the selected package. At the beginning, the pheromone values of all package lines are initialized to a very small value $c$, $0 < c \leq 1$. After the problem graph has been initialized with a small amount of pheromone and each ant's starting point has been defined, a small number of ants run for a certain number of iterations. In every iteration, each ant determines a path through the graph from its starting point to the solid package line. The quality of a solution found by the ACO is measured by the total utility of the coalition in Eq. 25.6.
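The package-selection rule can be sketched as follows. This is an illustration under stated assumptions rather than the author's code: candidates are keyed by hypothetical `(units, package)` tuples, the "otherwise" case of Eq. 25.9 is read as "no item in the package matches any request", and the denominator of Eq. 25.7 is taken as the sum over all remaining candidates.

```python
def heuristic(matched_items, total_items):
    """Eq. 25.9: share of items in the selected package matched to requests."""
    if total_items == 0 or matched_items == 0:
        return 0.0          # assumed reading of the "otherwise" case
    if matched_items == total_items:
        return 1.0
    return matched_items / total_items


def selection_probabilities(candidates, tau, eta, alpha, beta):
    """Eq. 25.7: probability of each (units, package) choice not yet selected."""
    weights = {c: (tau[c] ** alpha) * (eta[c] ** beta) for c in candidates}
    total = sum(weights.values())
    if total == 0.0:
        return {c: 0.0 for c in candidates}
    return {c: w / total for c, w in weights.items()}


def pheromone_deposit(coalition_utility, Q=1.0):
    """Eq. 25.8: amount Q / U^k laid by ant k on each choice it used."""
    return Q / coalition_utility
```

For example, with equal pheromone on two candidates, a fully matched package ($\eta = 1$) is preferred over one in which only a quarter of the items match a request, and the strength of that preference is controlled by $\beta$.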

25.3.4 GroupBuyACO Algorithm

[Algorithm listing not recoverable from the source; it refers to Eqs. 25.7, 25.8, and 25.9.]


Table 25.3 Data settings for GroupBuyACO algorithm

Constant             Detail                               Value
NumOfBuyer           No. of buyers                        10
NumOfSeller          No. of sellers                       3
MaxNumPackageSeller  Max no. of packages for each seller  5
NumOfTypeInPackage   No. of product types in package      4

This section presents the implementation of the ACO algorithm for forming buyer groups with bundles of items, called the GroupBuyACO algorithm. The proposed algorithm is described by the algorithm listing in this section.

25.4 Experimental Results

This section describes the initial data settings of the simulation for forming buyer coalitions with the proposed GroupBuyACO algorithm. The algorithm was run several times with different numbers of artificial ants, values of $\alpha$ and $\beta$, and evaporation rates ($\rho$) to find which values steer the algorithm towards the best solution.

25.4.1 Initial Data Settings

The experimental results of the proposed algorithm are derived from a simulation implemented in more than 4,000 lines of C++. It was run on an IBM PC with a Pentium(R) D CPU at 2.80 GHz and 2 GB of RAM. Table 25.3 summarizes the initial data settings for the GroupBuyACO algorithm in the simulation. For this example, the buyers' orders and reservation prices are selected randomly to demonstrate that the proposed algorithm can work with real-world data. Three different sellers offer various packages, all of which are pure bundling packages. Table 25.4 shows the products and price list offered by each seller. Seller s1 offers six packages. The first four are one-item packages and the rest are two-item packages. The average number of items per package for s1 is (4 × 1 + 2 × 2)/6 = 1.33. Seller s2 combines two items in each package, so its average number of items per package is two. Seller s3 offers four packages of three items, so its average number of items per package is three. Ten buyers participate in the group buying, as shown in Table 25.5.
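The averages quoted above can be reproduced with a small sketch. The package contents are transcribed from Table 25.4, with each package written as a string of product letters; the dictionary layout and the helper name are illustrative assumptions.

```python
# Product types bundled in each package, transcribed from Table 25.4.
packages = {
    "s1": ["A", "B", "C", "D", "AB", "BC"],
    "s2": ["CD", "AC", "AD", "BD"],
    "s3": ["ABC", "BCD", "ACD", "ABD"],
}


def average_items_per_package(bundles):
    """Mean number of items over a seller's packages (hypothetical helper)."""
    return sum(len(bundle) for bundle in bundles) / len(bundles)


averages = {seller: average_items_per_package(b) for seller, b in packages.items()}
for seller, value in averages.items():
    print(f"{seller}: {value:.2f} items per package")
```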


Table 25.4 The price list example for the simulation

Seller  Package    A          B          C          D          Price ($)
s1      package11  pack of 1  –          –          –          1,000
s1      package12  –          pack of 1  –          –          1,000
s1      package13  –          –          pack of 1  –          1,000
s1      package14  –          –          –          pack of 1  1,000
s1      package15  pack of 1  pack of 1  –          –          1,950
s1      package16  –          pack of 1  pack of 1  –          1,900
s2      package25  –          –          pack of 1  pack of 1  1,925
s2      package26  pack of 1  –          pack of 1  –          1,950
s2      package25  pack of 1  –          –          pack of 1  1,920
s2      package26  –          pack of 1  –          pack of 1  1,970
s3      package31  pack of 1  pack of 1  pack of 1  –          2,700
s3      package32  –          pack of 1  pack of 1  pack of 1  2,690
s3      package33  pack of 1  –          pack of 1  pack of 1  2,750
s3      package34  pack of 1  pack of 1  –          pack of 1  2,700

Table 25.5 Buyer orders
(each entry: number of items × (reservation price in $))

Buyer  A            B            C             D
b1     –            –            1 × (970.0)   –
b2     1 × (960.0)  1 × (975.0)  –             –
b3     –            –            1 × (1000.0)  –
b4     2 × (969.0)  –            –             –
b5     –            1 × (955.0)  1 × (960.00)  –
b6     –            –            –             1 × (980.00)
b7     –            –            2 × (980.0)   –
b8     –            4 × (970.0)  –             –
b9     –            –            –             1 × (989.0)
b10    1 × (965.0)  –            –             –

25.4.2 The GroupBuyACO Algorithm Performance

The first two parameters to be studied are $\alpha$ and $\beta$. As shown in Eq. 25.7, these parameters are related to the probability $p_{ij}^{k}$ of selecting $i$ units of package $j$, because $\alpha$ is the exponent of the pheromone term $\tau_{ij}$ and $\beta$ is the exponent of $\eta_{ij}$. Thus, variations in the values of $\alpha$ and $\beta$ might play an important role in the GroupBuyACO algorithm. Let both $\alpha$ and $\beta$ range from 0.5 to 3, with the number of iterations set to 200. The results for the corresponding variations in $\alpha$ and $\beta$ are shown in Table 25.6. The best result is shown in bold. It can be seen

Table 25.6 Average group utility for different values of α and β (iteration number = 2,000)

α \ β   0.5      1        2        3
0       759.06   457.22   791.33   673.21
0.5     755.25   594.23   814.71   734.72
1       623.57   757.48   698.21   542.01
2       927.24   907.09   456.98   671.45
3       554.65   657.84   569.24   459.27

Fig. 25.2 Number of iterations where initial settings α = 2, β = 0.5, and ρ = 0.1

Table 25.7 The comparison of GroupBuyACO algorithm with the genetic algorithm

GroupBuyACO algorithm ($)   GroupPackageString ($)
927.11                      909.74

that the average utility of the group earned by the GroupBuyACO algorithm is highest when α = 2 and β = 0.5. The evaporation rate ρ of the pheromone is one of the most important variables of the GroupBuyACO algorithm. From Fig. 25.2, it can be seen that when ρ is approximately 0.1, the total utility earned from group buying is the highest. The proposed algorithm is compared with the GAGroupBuyer scheme by Sukstrienwong [16]. To evaluate the performance of GroupBuyACO, the default parameter configuration was set to α = 2, β = 0.5, and ρ = 0.1. As Table 25.7 shows, the GroupBuyACO algorithm outperforms GroupPackageString.

25.5 Conclusions and Future Work

In this paper, a new method for buyer coalition formation with bundles of items by the ant colony optimization technique is proposed. The aim of the proposed algorithm is to form a buyer coalition that maximizes the group's total utility. The ants


construct the trail by depositing pheromone after moving along a path and by updating the pheromone values associated with good or promising solutions along the edges of the path. The experimental results show that the proposed algorithm is effective in finding the best buyer coalitions with bundles of items. The solution quality of the GroupBuyACO algorithm is assessed by comparison with the genetic algorithm technique called the GroupPackageString scheme. The experimental results show that the GroupBuyACO algorithm yields better results than the GAGroupBuyer scheme. However, the proposed algorithm imposes some restrictive constraints on forming a buyer coalition: (1) all buyers quote specific prices for their requested products after they have seen the price list of all packages provided by sellers; (2) the buyer coalition is formed with respect to the price attribute only; and (3) the price per item decreases monotonically as the size of the package increases. Relaxing these restrictions can be investigated in future research.

References

1. Alipour H, Khosrowshahi Asl E, Esmaeili M, Nourhosseini M (2008) ACO-FCR: applying ACO-based algorithms to induct FCR. Lecture notes in engineering and computer science: proceedings of the world congress on engineering 2008, 2–4 July, London, UK, pp 12–17
2. Dana J (2004) Buyer groups as strategic commitments. Mimeo, Northwestern University, USA
3. Dorigo M, Gambardella LM (1997) Ant colony system: a cooperative learning approach to the traveling salesman problem. IEEE Trans Evolut Comput 1(1):53–66
4. Dorigo M, Di Caro G (1999) The ant colony optimization metaheuristic. In: Corne D et al (eds) New ideas in optimization. McGraw Hill, London, pp 11–32
5. Goss S, Beckers R, Deneubourg JL, Aron S, Pasteels JM (1990) How trail laying and trail following can solve foraging problems for ant colonies. In: Hughes RN (ed) Behavioural mechanisms of food selection. NATO-ASI Series, G 20. Springer, Berlin
6. Gurler U, Oztop S, Sen A (2009) Optimal bundle formation and pricing of two products with limited stock. Int J Prod Econ
7. Hölldobler B, Wilson EO (1990) The ants. Springer, Berlin, p 732
8. Hyodo M, Matsuo T, Ito T (2003) An optimal coalition formation among buyer agents based on a genetic algorithm. In: 16th international conference on industrial and engineering applications of artificial intelligence and expert systems (IEA/AIE'03), Loughborough, UK, pp 759–767
9. Ismail M, Nur Hazima FI, Mohd. Rozely K, Muhammad Khayat I, Titik Khawa AR, Mohd Rafi A (2008) Ant colony optimization (ACO) technique in economic power dispatch problems. Lecture notes in engineering and computer science: proceedings of the international multiconference of engineers and computer scientists, 19–21 March 2008, Hong Kong, pp 1387–1392
10. Ito T, Hiroyuki O, Toramatsu S (2002) A group buy protocol based on coalition formation for agent-mediated e-commerce. IJCIS 3(1):11–20
11. Laor B, Leung HF, Boonjing V, Dickson KW (2009) Forming buyer coalitions with bundles of items. In: Nguyen NT, Hakansson A, Hartung R, Howlett R, Jain LC (eds) KES-AMSTA 2009. LNAI 5559-0717. Springer, Heidelberg, pp 121–138
12. Lawler EL, Lenstra JK, Rinnooy-Kan AHG, Shmoys DB (eds) (1985) The traveling salesman problem. Wiley, New York


13. Li C, Sycara K (2007) Algorithm for combinatorial coalition formation and payoff diversion in an electronic marketplace. In: Proceedings of the first international joint conference on autonomous agents and multiagent systems, pp 120–127
14. Mahdi S (2007) Negotiating agents in e-commerce based on a combined strategy using genetic algorithms as well as fuzzy fairness function. In: Proceedings of the world congress on engineering, WCE 2007, vol I. 2–4 July 2007, London, UK
15. Maniezzo V, Colorni A, Dorigo M (1994) The ant system applied to the quadratic assignment problem. Université Libre de Bruxelles, Belgium, Tech. Rep. IRIDIA/94-28
16. Sukstrienwong A (2010) Buyer formation with bundle of items in e-marketplaces by genetic algorithm. Lecture notes in engineering and computer science: proceedings of the international multiconference of engineers and computer scientists 2010, IMECS 2010, 17–19 March 2010, Hong Kong, pp 158–162
17. Sukstrienwong A (2010) Ant colony optimization for buyer coalition with bundle of items. Lecture notes in engineering and computer science: proceedings of the world congress on engineering 2010, WCE 2010, 30 June–2 July, London, UK, pp 38–43
18. Tsvetovat M, Sycara KP, Chen Y, Ying J (2001) Customer coalitions in electronic markets. Lecture notes in computer science, vol 2003. Springer, Heidelberg, pp 121–138
19. Yamada T, Reeves CR (1998) Solving the Csum permutation flowshop scheduling problem by genetic local search. In: Proceedings of 1998 IEEE international conference on evolutionary computation, pp 230–234
20. Yamamoto J, Sycara K (2001) A stable and efficient buyer coalition formation scheme for e-marketplaces. In: Proceedings of the 5th international conference on autonomous agents, Montreal, Quebec, Canada, pp 576–583

Chapter 26

Coevolutionary Grammatical Evolution for Building Trading Algorithms

Kamal Adamu and Steve Phelps

Abstract Advancements in communications and computer technology have enabled traders to program their trading strategies into computer programs (trading algorithms) that submit electronic orders to an exchange automatically. The work in this chapter uses a coevolutionary algorithm based on grammatical evolution to produce trading algorithms. The trading algorithms developed are benchmarked against a publicly available trading system called the turtle trading system (TTS). The results suggest that our framework is capable of producing trading algorithms that outperform the TTS. In addition, a comparison between trading algorithms developed under a utilitarian framework and those developed using the Sharpe ratio as the objective function shows that they have statistically different performance.

26.1 Introduction

Traders make trade decisions specifying entry, exit, and stop loss prices [1]. The entry is the price at which the trader wishes to enter the market, the exit is the price at which the trader expects to take profit, and the stop loss is the price at which the trader wants to exit a position when a trade is not in her favour [1]. A set of entry, exit, and stop loss rules is referred to as a trading system and

K. Adamu, S. Phelps: Center for Computational Finance and Economic Agents, University of Essex, Colchester, UK

S. I. Ao and L. Gelman (eds.), Electrical Engineering and Applied Computing, Lecture Notes in Electrical Engineering, 90, DOI: 10.1007/978-94-007-1192-1_26, Springer Science+Business Media B.V. 2011


there exists an interdependency between these rules [1]. A trader who consistently fails to exit a losing trade once a tolerable amount of loss has been incurred will almost certainly be wiped out after a few losing trades. Moreover, a trader who takes profit too early or too late will have very little to cover costs and losses, or will lose part of the profit already made [1]. Technicians decide on entry, exit, and stop loss prices based on technical trading rules [1]. Advancements in communication and computer technology have allowed traders to submit trades electronically using computer programs (trading algorithms) that sift through a vast amount of information looking for trade opportunities [2]. Trading algorithms have gained popularity due to their cost-effective nature [2]. According to Hendershott et al. [2], 75% of trades executed in the US in 2009 were made by trading algorithms. The aim of the work in this chapter is to test whether a methodology based on grammatical evolution (GE) [3] can be used to coevolve rules for entry, exit, and stop loss that outperform a publicly available trading system called the turtle trading system in high-frequency trading [1, 4]. This chapter also tests whether trading algorithms developed under a utilitarian framework have similar performance to trading algorithms developed using the Sharpe ratio as the objective function. Adamu and Phelps and Saks and Maringer [5, 6] employ cooperative coevolution in developing technical trading rules. In this chapter, we coevolve rules that form trading algorithms using GE for high-frequency trading. The trading algorithms evolved are benchmarked against the turtle trading system. In addition, the effect of various objective functions on the trading algorithms evolved is considered. The rest of this chapter is organised as follows. Section 26.2 gives a survey of the turtle trading system and investor preference. We explain our framework in Sect. 26.3 and present the data used for the study in Sect. 26.4. Our results are presented in Sect. 26.5, and the chapter ends with a summary in Sect. 26.6.

26.2 Background

26.2.1 The Turtle Trading System

A trading system is a set of rules that signal when to enter and exit a position, where a position is a stake in a particular asset in a particular market [1]. The rules in trading systems specify when to enter the market when prices are expected to fall, when to enter the market when prices are expected to rise, and how to minimise loss and maximise profit (money management) [1]. The entry rules for the turtle trading system are specified as follows [1, 4]:


$H_t$, $t \in \{1, 2, 3, 4, \ldots, T\}$, is the current highest price, and $L_t$, $t \in \{1, 2, 3, 4, \ldots, T\}$, is the current lowest price. The TTS places the initial stop loss at entry using the following equation:

\[ Stop_t = \begin{cases} Stop_{t-1} - 2\,ATR_t & \text{if long} \\ Stop_{t-1} + 2\,ATR_t & \text{if short} \end{cases} \tag{26.1} \]

where $ATR_t$ is the current average true range and it is calculated as follows:

\[ ATR_t = \frac{19\,N_{t-1} + TR_t}{20} \tag{26.2} \]

$TR_t$, $t \in \{1, 2, 3, 4, \ldots, T\}$, is the true range and is calculated as follows:

\[ TR_t = \max(H_t - L_t,\; H_t - C_{t-1},\; C_{t-1} - L_t) \tag{26.3} \]

$C_t$, $t \in \{1, 2, 3, 4, \ldots, T\}$, is the price at the end of the time interval $t$.
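The stop loss recursion above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, the numbers are made up, and it assumes that $N_{t-1}$ in Eq. 26.2 denotes the previous value of the average true range.

```python
def true_range(high, low, prev_close):
    """Eq. 26.3: the largest of the three candidate ranges."""
    return max(high - low, high - prev_close, prev_close - low)


def update_atr(prev_atr, tr):
    """Eq. 26.2, assuming N_{t-1} is the previous average true range."""
    return (19.0 * prev_atr + tr) / 20.0


def update_stop(prev_stop, atr, is_long):
    """Eq. 26.1: shift the previous stop by two ATRs, down if long, up if short."""
    return prev_stop - 2.0 * atr if is_long else prev_stop + 2.0 * atr


# Hypothetical bar: high 10.5, low 9.8, previous close 10.0, previous ATR 0.6
tr = true_range(10.5, 9.8, 10.0)
atr = update_atr(0.6, tr)
long_stop = update_stop(10.0, atr, is_long=True)
```

Keeping the three equations as separate functions mirrors the order in which they depend on each other: the true range feeds the average true range, which in turn sets the stop.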

26.2.2 Investor Preference

26.2.2.1 Utility Theory

Traditional finance and economics postulate that the financial markets are populated by rational, risk averse agents that prefer more wealth to less wealth [7]. One of the cornerstones of efficient markets is the presence of homo economicus, the rational risk averse agent with a preference for more wealth over less wealth [6, 7]. In utilitarian terms this translates to expected utility maximising investors with utility functions that satisfy $U'(W) > 0$ and $U''(W) < 0$, where $U(W)$ is the utility of wealth and $W$ is the current level of wealth [7]. The power utility function, negative exponential utility function, and quadratic utility function satisfy $U'(W) > 0$ and $U''(W) < 0$.

314

K. Adamu and S. Phelps

The power utility function is defined by the following equation [7]:

\[ U(W) = \frac{W^{1-\gamma}}{1-\gamma}, \quad \gamma > 0, \quad \gamma \neq 1 \tag{26.4} \]

$\gamma$ controls the degree of risk aversion of the utility function. There is evidence to suggest that traders exhibit loss aversion as well as risk aversion [6]; hence, in this chapter, a modified wealth is used to account for loss aversion [6]. The power utility function (PUF) is then defined as follows [6]:

\[ U(W_i) = \frac{W_i^{1-\gamma}}{1-\gamma} - \frac{1}{1-\gamma}, \quad \gamma > 1 \tag{26.5} \]

\[ W_i = \begin{cases} W_0 (1 + v_i) & v_i > 0 \\ W_0 (1 + v_i)^{\lambda} & v_i < 0, \; \lambda > 1 \end{cases} \tag{26.6} \]

where $v_i$ is the simple return for trade interval $i \in \{1, 2, \ldots, N\}$ and $W_i$ is a modified level of wealth for that trade interval. For this study we consider the case of a unit investor and set the initial level of wealth $W_0 = 1$. $\gamma$ and $\lambda$ define the risk and loss preferences of the agents respectively.

Quadratic utility function. The quadratic utility function (QUF) is given by the following equation [7]:

\[ U(W_i) = W_i - \frac{b}{2} W_i^2, \quad b > 0 \tag{26.7} \]

where $W$ is the wealth. To satisfy the condition $U'(W) > 0$, we set $W = 1/b$ for levels of wealth $W > 1/b$.

The negative exponential utility function is given by the following equation [7]:

\[ U(W_i) = a - b e^{-c W_i}, \quad c > 0 \tag{26.8} \]
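As a rough illustration of Eqs. 26.5 to 26.8, the three utility functions might be coded as below. The function and argument names are hypothetical (`gamma` stands for the risk aversion parameter and `lam` for the loss aversion exponent of the PUF), and the treatment of a zero return, where both branches of Eq. 26.6 coincide, is an assumption.

```python
import math


def modified_wealth(v, lam, w0=1.0):
    """Eq. 26.6: losses are raised to the power lam > 1, which amplifies them.

    At v == 0 both branches give w0, so the choice of branch does not matter.
    """
    return w0 * (1.0 + v) if v > 0 else w0 * (1.0 + v) ** lam


def power_utility(w, gamma):
    """Eq. 26.5 with gamma > 1."""
    return (w ** (1.0 - gamma) - 1.0) / (1.0 - gamma)


def quadratic_utility(w, b):
    """Eq. 26.7, with wealth capped at 1/b so utility never falls as wealth rises."""
    w = min(w, 1.0 / b)
    return w - 0.5 * b * w * w


def neg_exp_utility(w, a, b, c):
    """Eq. 26.8."""
    return a - b * math.exp(-c * w)


# A 10% loss hurts more than a 10% gain helps under the modified wealth.
u_loss = power_utility(modified_wealth(-0.1, lam=2.0), gamma=2.0)
u_gain = power_utility(modified_wealth(0.1, lam=2.0), gamma=2.0)
```

With these illustrative parameter values, `u_loss` is larger in magnitude than `u_gain`, which is the asymmetry that the modified wealth is meant to introduce.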

26.2.2.2 Sharpe Ratio

It follows that, provided a utility function satisfies $U'(W) > 0$ and $U''(W) < 0$, it suffices to look at the mean and variance of the outcome of investments, regardless of the distribution of the outcome of investments [8]. One measure that takes the mean of return and the standard deviation of return into account is the Sharpe ratio [8]. The Sharpe ratio is defined by the following formula [8]:

\[ \frac{\mu_r - r_f}{\sigma_r} \tag{26.9} \]

$r_f$ is the risk free rate of interest (this is negligible in high frequency), and $\mu_r$ and $\sigma_r$ are the mean and standard deviation of return respectively. A high Sharpe ratio implies a high mean return per unit risk and vice versa.
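A minimal sketch of Eq. 26.9 follows. The function name and the sample returns are made up for illustration, and the use of the sample (rather than population) standard deviation is an assumption, since the text does not say which is used.

```python
import statistics


def sharpe_ratio(returns, risk_free=0.0):
    """Eq. 26.9: mean excess return per unit of standard deviation."""
    mean = statistics.mean(returns)
    std = statistics.stdev(returns)  # sample standard deviation (assumption)
    return (mean - risk_free) / std


# Hypothetical per-interval returns; the risk free rate is taken as zero,
# in line with the remark that it is negligible in high frequency.
sr = sharpe_ratio([0.01, 0.03, -0.01, 0.01])
```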


26.3 Framework

Our framework develops trading algorithms of the form:

Our framework coevolves the entry, exit, and stop loss rules for long and short positions respectively. Each set of rules is a species on its own. We denote the species of entry, exit, and stop loss rules for long positions as $E^L_k$, $C^L_k$, and $S^L_k$ respectively, with $k \in \{1, 2, 3, \ldots, N\}$. $E^S_k$, $C^S_k$, and $S^S_k$, $k \in \{1, 2, 3, \ldots, N\}$, denote the entry, exit, and stop loss rules for short positions. The transition table for the algorithm given above is in Table 26.1. GE is used to evolve species within each population. Sexual reproduction is interspecies and we employ an implicit speciation technique within each species [9]. Rules are spatially distributed on a notional toroid and sexually reproduce with rules within their deme [9]. This is akin to individuals sharing information with individuals within their social circle. The deme of a rule k is the set of individuals within the immediate vicinity of rule k on the imaginary toroid. Collaborators are chosen from other species based on an elitist principle [10]. For instance, when assessing a solution from the set $E^L_k$, the best rules from $C^L_k$, $S^L_k$, $E^S_k$, $C^S_k$, and $S^S_k$ are chosen for collaboration and the fitness attained is


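Table 26.1 Transition table for entry, exit, and stop loss rules

| Current position | $E^L_k$ | $C^L_k$ | $S^L_k$ | $E^S_k$ | $C^S_k$ | $S^S_k$ | Action |
|---|---|---|---|---|---|---|---|
| Long | X | 0 | 0 | X | X | X | Do nothing |
| Long | X | 0 | 1 | X | X | X | Close long position |
| Long | X | 1 | 0 | X | X | X | Close long position |
| Long | X | 1 | 1 | X | X | X | Close long position |
| Short | X | X | X | X | 0 | 0 | Do nothing |
| Short | X | X | X | X | 0 | 1 | Close short position |
| Short | X | X | X | X | 1 | 0 | Close short position |
| Short | X | X | X | X | 1 | 1 | Close short position |
| Neutral | 0 | X | X | 0 | X | X | Do nothing |
| Neutral | 0 | X | X | 1 | X | X | Open short position |
| Neutral | 1 | X | X | 0 | X | X | Open long position |
| Neutral | 1 | X | X | 1 | X | X | Do nothing |

X stands for ignore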

assigned to rule k. The platform for collaboration and fitness evaluation is a trading algorithm. Each species exerts evolutionary pressure on the others, and rules that contribute to the profitability of the trading algorithm attain high fitness and survive to pass down their genetic material to their offspring. On the other hand, rules that do not contribute are awarded low fitness and are eventually replaced by solutions with higher fitness. Selection occurs at the population level such that for each species a tournament is performed and if the fitness of a rule is less than the fitness of its offspring then it is replaced by its offspring. This can be expressed formally using the following equation:

\[ x = \begin{cases} y & \text{if } f_y > f_x \\ x & \text{otherwise} \end{cases} \tag{26.10} \]

where $x$ is a rule, $y$ is its offspring, and $f_x$ and $f_y$ are their fitness values.
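The transition logic of Table 26.1 and the replacement rule of Eq. 26.10 can be sketched as follows. This is an illustrative reading of the table, not the authors' implementation: the function names, the string labels, and the boolean encoding of the six rule outputs are all assumptions.

```python
def next_action(position, entry_long, close_long, stop_long,
                entry_short, close_short, stop_short):
    """Map the current position and the six rule outputs to an action."""
    if position == "long":
        # Only the long exit and long stop loss rules matter (others are X).
        return "close long" if (close_long or stop_long) else "do nothing"
    if position == "short":
        return "close short" if (close_short or stop_short) else "do nothing"
    # Neutral: act only when exactly one of the two entry rules fires.
    if entry_long and not entry_short:
        return "open long"
    if entry_short and not entry_long:
        return "open short"
    return "do nothing"


def select(parent, offspring, fitness):
    """Eq. 26.10: the offspring replaces the parent only if it is strictly fitter."""
    return offspring if fitness(offspring) > fitness(parent) else parent


# Hypothetical usage: a neutral trader whose long entry rule alone fires.
action = next_action("neutral", True, False, False, False, False, False)
```

Note that conflicting entry signals in the neutral state lead to no trade, which matches the last row of the table.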

26.3.1 Objective Function

The following assumptions are implicit in the fitness evaluation:

1. Only one position can be traded at any instant.
2. Only one unit can be traded at any instant.
3. There is no market friction (zero transaction cost, zero slippage, zero market impact). Arguably, since only one unit is traded at any instant, the effect of market impact can be considered to be negligible.


26.3.1.1 Sharpe Ratio

The Sharpe ratio is computed using Eq. 26.9. The objective is then to maximise:

\[ \max_k \frac{\mu_r^k}{\sigma_r^k} \tag{26.11} \]

where $\mu_r^k$ and $\sigma_r^k$ are the mean and standard deviation of return of trading algorithm $k \in \{1, 2, 3, \ldots, P\}$.

26.3.1.2 Expected Utility

The objective function when using utility functions is the expected utility, which is calculated as follows:

\[ f = E(U(W)) = \frac{1}{N} \sum_{j=1}^{30} \sum_{i=1}^{N} U(W_i, \theta_j) \tag{26.12} \]

where $U(W_i, \theta_j)$ is the utility of wealth at interval $i \in \{1, 2, 3, \ldots, N\}$ given the vector of parameters $\theta_j$ for the utility function, and $N$ is the number of trading intervals. The utility for each interval is calculated for a range of parameter values (see Sect. 26.3.2 for parameter settings). The objective in the utilitarian framework can be formally expressed as follows:

\[ \max_{\theta_j} E(U(W_i, \theta_j)), \quad i \in \{1, 2, 3, \ldots, N\} \tag{26.13} \]
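Read literally, Eq. 26.12 sums the utility over every trade interval and every sampled parameter vector and divides by the number of intervals. A small sketch under that reading is given below; the function name, the toy utility, and the sample values are hypothetical.

```python
def expected_utility(wealths, utility, thetas):
    """Eq. 26.12 as printed: sum over parameter draws and intervals, divided by N."""
    n = len(wealths)
    return sum(utility(w, theta) for theta in thetas for w in wealths) / n


# Toy example: two intervals, two parameter draws, and a linear "utility".
f = expected_utility([1.0, 2.0], lambda w, theta: theta * w, [1.0, 2.0])
```

In the chapter the outer sum runs over 30 parameter draws; the sketch accepts any number of draws so that it can be checked by hand on a small case.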

26.3.2 Parameter Settings

The population size P of each species is set to 100 and the coevolutionary process is allowed to run for a maximum number of generations, MaxGen = 200. The coevolutionary process is terminated after MaxGen/2 generations if there is no improvement in the fitness of the elitist (best solution) of the best solutions from each species. The deme size for each species is set to 11. The grammar used in mapping the entry and exit rules of the trading algorithms is shown in Table 26.2. The grammar used in mapping the stop loss rules of the trading systems is shown in Table 26.3. In our notation, O(t-n:t-1) represents a set of open prices, C(t-n:t-1) represents a set of closing prices, H(t-n:t-1) represents a set of highest prices, and L(t-n:t-1) represents a set of lowest prices between t-n and t-1. O(t-n) represents the open price at t-n, C(t-n) represents the closing price at t-n, H(t-n) represents the highest price at t-n, and L(t-n) represents the lowest price at t-n, where $n \in \{10, 11, 12, \ldots, 99\}$ and $t \in \{1, 2, \ldots\}$. sma(•) and ema(•) stand for simple and exponential moving average respectively.


Table 26.2 Grammar for mapping $E^L_k$, $E^S_k$, $C^L_k$, and $C^S_k$

<expr> ::= <binop>(<expr>, <expr>) | <rule>   (n = 2)
<rule> ::= <var> <op> <var> | <var> <op> <fun> | <fun> <op> <fun>   (n = 3)
<binop> ::= and | or | xor   (n = 3)
<var> ::= H(t-<window>) | L(t-<window>) | O(t-<window>) | C(t-<window>)   (n = 4)
<op> ::= > | < | = | …
<window> ::= <integer><integer>   (n = 1)
<integer> ::= 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9   (n = 9)
<fun> ::= sma(H(t-<window>:t-1)) | ema(H(t-<window>:t-1)) | max(H(t-<window>:t-1)) | min(H(t-<window>:t-1)) | sma(L(t-<window>:t-1)) | ema(L(t-<window>:t-1)) | max(L(t-<window>:t-1)) | min(L(t-<window>:t-1)) | sma(O(t-<window>:t-1)) | ema(O(t-<window>:t-1)) | max(O(t-<window>:t-1)) | min(O(t-<window>:t-1)) | sma(C(t-<window>:t-1)) | ema(C(t-<window>:t-1)) | max(C(t-<window>:t-1)) | min(C(t-<window>:t-1))   (n = 15)

Φ is the set of non-terminals (left-hand sides), and n is the number of rules for mapping the non-terminal Φ

Table 26.3 Grammar for mapping $S^L_k$ and $S^S_k$

<expr> ::= <preop>(<expr>, <expr>) | <rule>   (n = 2)
<rule> ::= <rule> <op> <rule> | <var> <op> <var> | <var> <op> <fun> | <fun> <op> <fun> | <fun> | <var>   (n = 6)
<preop> ::= min | max   (n = 2)
<var> ::= H(t-<window>) | L(t-<window>) | O(t-<window>) | C(t-<window>)   (n = 4)
<fun> ::= sma(H(t-<window>:t-1)) | ema(H(t-<window>:t-1)) | max(H(t-<window>:t-1)) | min(H(t-<window>:t-1)) | sma(L(t-<window>:t-1)) | ema(L(t-<window>:t-1)) | max(L(t-<window>:t-1)) | min(L(t-<window>:t-1)) | sma(O(t-<window>:t-1)) | ema(O(t-<window>:t-1)) | max(O(t-<window>:t-1)) | min(O(t-<window>:t-1)) | sma(C(t-<window>:t-1)) | ema(C(t-<window>:t-1)) | max(C(t-<window>:t-1)) | min(C(t-<window>:t-1))   (n = 15)
<window> ::= <integer><integer>   (n = 1)
<integer> ::= 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9   (n = 9)

Φ is the set of non-terminals (left-hand sides), and n is the number of rules for mapping the non-terminal Φ
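The chapter does not spell out how a genotype is mapped through these grammars. The sketch below assumes the usual GE convention, in which each integer codon selects a production by taking the codon modulo the number of alternatives for the leftmost non-terminal, and wraps around the genome if it runs out of codons. The grammar is a trimmed, non-recursive subset of Table 26.2 chosen for illustration (only the three legible comparison operators and two of the window functions), and consuming a codon only when there is a real choice is a further assumption.

```python
# Trimmed subset of the entry/exit grammar; not the full Table 26.2.
GRAMMAR = {
    "<rule>": [["<var>", " ", "<op>", " ", "<var>"],
               ["<var>", " ", "<op>", " ", "<fun>"],
               ["<fun>", " ", "<op>", " ", "<fun>"]],
    "<var>": [["H(t-", "<window>", ")"], ["L(t-", "<window>", ")"],
              ["O(t-", "<window>", ")"], ["C(t-", "<window>", ")"]],
    "<op>": [[">"], ["<"], ["="]],
    "<fun>": [["sma(C(t-", "<window>", ":t-1))"],
              ["max(H(t-", "<window>", ":t-1))"]],
    "<window>": [["<integer>", "<integer>"]],
    "<integer>": [[str(d)] for d in range(1, 10)],
}


def ge_map(codons, start="<rule>", max_steps=200):
    """Expand the leftmost non-terminal using codon % (number of alternatives)."""
    symbols, out, used = [start], [], 0
    for _ in range(max_steps):
        if not symbols:
            return "".join(out)
        sym = symbols.pop(0)
        if sym not in GRAMMAR:          # terminal: emit it
            out.append(sym)
            continue
        choices = GRAMMAR[sym]
        if len(choices) == 1:           # no choice: no codon consumed (assumption)
            chosen = choices[0]
        else:
            chosen = choices[codons[used % len(codons)] % len(choices)]
            used += 1
        symbols = list(chosen) + symbols
    if symbols:
        raise ValueError("mapping did not terminate within max_steps")
    return "".join(out)


rule = ge_map([0, 2, 0, 4, 1, 1, 1, 2])
```

For the hypothetical genome shown, the mapping yields the rule `O(t-15) < L(t-23)`, that is, a comparison between the open price 15 intervals ago and the low price 23 intervals ago.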


Fig. 26.1 Average out-of-sample Sharpe ratio of trading systems produced under the assumption of PUF, NEUF, QUF, and Sharpe ratio fitness functions

The parameters of the power utility function (PUF), $\lambda$ and $\gamma$, are sampled within the following ranges: $1 < \lambda < 2$ and $1 < \gamma < 35$, where $\lambda$ controls the degree of loss aversion and $\gamma$ controls the degree of risk aversion. The parameters of the negative exponential utility function (NEUF), $a$, $b$, and $c$, were sampled from the following ranges: $1 < a < 35$, $1 < b < 35$, and $1 < c < 35$. The parameter for the quadratic utility function (QUF) was sampled from the following range: $1 < b < 35$.

26.4 Data

In this chapter, we use high frequency tick data for Amvesco for the period between 1 March 2007 and 1 April 2007. The data was compressed into a series of five-minute high, low, open, and close prices. The data was then divided into four blocks for k-fold cross validation [11].

26.5 Results and Discussion

In this section, we present the results obtained from producing trading algorithms under the assumption of power utility function (PUF), negative exponential utility function (NEUF), quadratic utility function (QUF), and Sharpe ratio (SR) as objective function. Utility is not comparable across different utility functions; hence, analysis is performed directly on the returns obtained by the trading algorithms. We take the average of the performance of the trading algorithms across different blocks in accordance with k-fold cross validation. Furthermore, the trading algorithms developed are compared to the turtle trading system (TTS) (see Sect. 26.2.1). The comparison serves as a test of the hypothesis that trading systems developed using our framework are able to outperform the turtle trading system. In addition, the trading algorithms developed are compared to a set of randomly initialised trading algorithms (MC). The randomly initialised trading algorithms


Table 26.4 Kruskal-Wallis ANOVA test results for out-of-sample average Sharpe ratios of agents produced under assumption of PUF, NEUF, QUF, and Sharpe ratio as fitness functions

| $\chi^2$ | p-value |
|---|---|
| 85.640 | 0.000 |

Table 26.5 Kruskal-Wallis ANOVA test results for the null hypothesis that the out-of-sample Sharpe ratios of agents produced under assumption of PUF, NEUF, QUF, and Sharpe ratio as fitness functions are the same as those of a set of random strategies (MC)

| Objective function | PUF | NEUF | QUF | SR |
|---|---|---|---|---|
| $\chi^2$ | 19.960 | 1.400 | 3.110 | 8.340 |
| p-value | 0.000 | 0.237 | 0.078 | 0.040 |

were mapped using the grammar used to coevolve our trading algorithms. The comparison tests whether the performance of the trading systems can be reproduced by chance. Figure 26.1 depicts the cumulative distribution function of the average Sharpe ratios of trading algorithms produced under the assumption of PUF, NEUF, QUF, and Sharpe ratio as fitness functions. Figure 26.1 suggests that, given the assumption of no budget constraints and frictionless markets, trading systems produced under the assumption of PUF and SR have a better reward to risk ratio (Sharpe ratio) for the data set considered. A Kruskal-Wallis ANOVA test for the null hypothesis that the Sharpe ratios of agents produced under assumption of PUF, NEUF, QUF, and SR as fitness functions are the same was performed, and the test results are shown in Table 26.4. The results in Table 26.4 show that trading systems produced under the assumption of PUF, NEUF, QUF, and SR produce Sharpe ratios that are statistically different from each other. The results in Table 26.4 support the results in Fig. 26.1. Traditional investment theory postulates that, provided investors have utility functions that satisfy the assumptions of risk aversion and non-satiation, then irrespective of their utility functions the mean and standard deviation of the outcomes of their investments are enough to summarise the distribution of outcomes. All the utility functions employed in this chapter satisfy the assumptions of risk aversion and non-satiation. The results in Fig. 26.1, however, show that there is a difference between trading systems developed using different utility functions and trading systems developed using the Sharpe ratio. Table 26.5 shows results from the Kruskal-Wallis ANOVA test for the null hypothesis that the Sharpe ratios of the agents produced under the assumption of PUF, NEUF, QUF, and SR as objective function are not different from those of a set of random strategies (MC).
The results in Table 26.5 suggest that trading systems produced under the assumption of PUF, QUF, and SR produce Sharpe ratios that are significantly different from those of a set of randomly initialised strategies (for QUF, with p = 0.078, only at the 10% level). This implies that the performance of these trading systems is highly unlikely to have resulted from pure chance. To test the hypothesis that our framework can be used to produce trading algorithms that outperform the turtle trading system, the performance of the


Table 26.6 Sign test results for the null hypothesis that the out-of-sample Sharpe ratios of trading systems produced under assumption of PUF, NEUF, QUF, and SR as fitness functions come from a continuous distribution with a median that is the same as the Sharpe ratio of the TTS

| Objective function | z-value | sign | p-value |
|---|---|---|---|
| PUF | 1.839 | 18 | 0.000 |
| NEUF | -6.418 | 1 | 0.000 |
| QUF | -4.000 | 10 | 0.000 |
| SR | -6.647 | 1 | 0.000 |

trading systems developed is compared to the performance of the turtle trading system using a sign test. Table 26.6 contains results from a sign test for the null hypothesis that the average out-of-sample Sharpe ratios of the trading systems developed under assumption of power utility function (PUF), negative exponential utility function (NEUF), quadratic utility function (QUF), and Sharpe ratio (SR) as objective function are not different from that of the turtle trading system (TTS). The results in Table 26.6 suggest that trading systems produced under the assumption of PUF as objective function produced Sharpe ratios that are significantly better than the Sharpe ratio of the TTS.

26.6 Chapter Summary

Advancements in communication and computer technology have allowed trading systems to be coded as computer programs that execute orders, and this approach has gained a lot of popularity [2]. In this chapter, we introduced a method based on GE for coevolving technical trading rules for high frequency trading (see Sect. 26.3 for the method). Our results suggest that our framework is capable of producing trading algorithms that outperform the turtle trading system under no budget constraint when using the power utility function as objective function. The results in this chapter show that there is a significant difference between the performance of trading systems that were produced under the assumption of PUF, NEUF, QUF, and Sharpe ratio as objective function. This suggests that the coevolutionary approach is highly sensitive to the objective function chosen.

References

1. Faith C (2003) The original turtle trading rules. http://www.originalturtle.org
2. Hendershott T, Jones CM, Menkveld AJ (2011) Does algorithmic trading improve liquidity? J Finance 66(1):1–33
3. O'Neill M, Brabazon A, Ryan C, Collins JJ (2001) Evolving market index trading rules using grammatical evolution. Appl Evolut Comput, Lect Notes Comput Sci 2037(2001):343–352


4. Anderson JA (2003) Taking a peek inside the turtle's shell. School of Economics and Finance, Queensland University of Technology, Australia
5. Adamu K, Phelps S (2010) Coevolution of technical trading rules for high frequency trading. Lecture notes in computer science and engineering, proceedings of the world congress on engineering, WCE 2010(1):96–101
6. Saks P, Maringer D (2009) Evolutionary money management. Lecture notes in computer science, Applications of evolutionary computing, vol 5484(2009). Springer, Heidelberg, pp 162–171
7. Cuthbertson K, Nitzsche D (2004) Quantitative financial economics, 2nd edn. Wiley, Chichester, pp 13–32 (Chapter 1)
8. Amman H, Rusten B (eds) (2005) Portfolio management with heuristic optimization. Advances in computational management sciences, vol 8. Springer, Berlin, pp 1–37 (Chapter 1)
9. Eiben AE, Smith JE (2003) Introduction to evolutionary computing. Springer, Berlin (Chapter 9)
10. Wiegand RP, Liles WC, De Jong KA (2001) An empirical analysis of collaboration methods in cooperative coevolutionary algorithms. In: Proceedings of the genetic and evolutionary computation conference, Morgan Kaufmann Publishers
11. Kohavi R (1995) A study of cross-validation and bootstrap for accuracy estimation and model selection. Int Joint Conf Artif Intell 14(2):1137–1145

Chapter 27

High Performance Computing Applied to the False Nearest Neighbors Method: Box-Assisted and kd-Tree Approaches

Julio J. Águila, Ismael Marín, Enrique Arias, María del Mar Artigao and Juan J. Miralles

Abstract In different fields of science and engineering (medicine, economics, oceanography, biological systems, etc.) the false nearest neighbors (FNN) method has a special relevance. In some of these applications, it is important to provide the results in a reasonable time scale, thus the execution time of the FNN method has to be reduced. To achieve this goal, a multidisciplinary group formed by computer scientists and physicists is working collaboratively on developing High Performance Computing implementations of two of the most popular algorithms that implement the FNN method: one based on the box-assisted algorithm and one based on the kd-tree data structure. In this paper, a comparative study of the distributed memory architecture implementations carried out in the framework of this collaboration is

J. J. Águila (✉), E. Arias: Albacete Research Institute of Informatics, University of Castilla-La Mancha, Avda. España s/n, 02071 Albacete, Spain
I. Marín, M. del Mar Artigao, J. J. Miralles: Applied Physics Department, University of Castilla-La Mancha, Avda. España s/n, 02071 Albacete, Spain
J. J. Águila: Depto. Ingeniería en Computación, Universidad de Magallanes, Avda. Bulnes, 01855 Punta Arenas, Chile

S. I. Ao and L. Gelman (eds.), Electrical Engineering and Applied Computing, Lecture Notes in Electrical Engineering, 90, DOI: 10.1007/978-94-007-1192-1_27, Springer Science+Business Media B.V. 2011


324

J. J. Águila et al.

presented. As a result, two parallel implementations of the box-assisted algorithm and one parallel implementation of the kd-tree structure are compared in terms of execution time, speed-up and efficiency. In terms of execution time, the approaches presented here are from 2 to 16 times faster than the sequential implementation, and the kd-tree approach is from 3 to 7 times faster than the box-assisted approaches.

27.1 Introduction

In nonlinear time series analysis the false nearest neighbors (FNN) method is crucial to the success of the subsequent analysis. Many fields of science and engineering use the results obtained with this method. But the complexity and size of time series increase day by day, and it is important to provide the results in a reasonable time scale. For example, in the study of electrocardiograms (ECG), this method has to achieve real-time performance so that preventive action can be taken. With the development of parallel computing, large amounts of processing power and memory capacity are available to close the gap between size and time. The FNN method was introduced by Kennel et al. [1]. Let $X = \{x(i) : 0 \le i < n\}$ be a time series. We can construct points (delay vectors) according to

\[ y(i) = [x(i), x(i + \tau), \ldots, x(i + (d - 1)\tau)] \tag{27.1} \]

where $\tau$ is the embedding delay and $d$ is the embedding dimension [2]. The Takens embedding theorem [3] states that for a large enough embedding dimension $d \ge m_0$, the delay vectors yield a phase space that has exactly the same properties as the one formed by the original variables of the system. The FNN method is a tool for determining the minimal embedding dimension $m_0$. Working in any dimension larger than the minimum leads to excessive computation when investigating any subsequent question (Lyapunov exponents, prediction, etc.). The method identifies the nearest neighbor $y(j)$ for each point $y(i)$. According to Eq. 27.2, if the normalized distance is larger than a given threshold $R_{tr}$, then the point $y(i)$ is marked as having a false nearest neighbor.

\[ \frac{|x(i + d\tau) - x(j + d\tau)|}{\|y(i) - y(j)\|} > R_{tr} \tag{27.2} \]
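To make the criterion concrete, the following brute-force sketch builds the delay vectors of Eq. 27.1 and counts the points that satisfy Eq. 27.2 for one dimension. It is not the box-assisted or kd-tree method studied in this chapter: the neighbor search is a plain O(n) scan per point, the function names are hypothetical, `tau` stands for the embedding delay, and skipping pairs at zero distance is an assumption.

```python
import math


def delay_vectors(x, d, tau):
    """Eq. 27.1, keeping only indices i for which x(i + d*tau) also exists."""
    m = len(x) - d * tau
    return [[x[i + k * tau] for k in range(d)] for i in range(m)]


def fnn_fraction(x, d, tau, r_tr):
    """Fraction of delay vectors whose nearest neighbour is false (Eq. 27.2)."""
    ys = delay_vectors(x, d, tau)
    false_count = 0
    for i, yi in enumerate(ys):
        # Brute-force nearest neighbour search over all other points.
        j = min((k for k in range(len(ys)) if k != i),
                key=lambda k: math.dist(yi, ys[k]))
        dist = math.dist(yi, ys[j])
        if dist > 0 and abs(x[i + d * tau] - x[j + d * tau]) / dist > r_tr:
            false_count += 1
    return false_count / len(ys)


# A linear ramp: every neighbour pair has a normalized distance of exactly 1.
ramp = list(range(20))
fraction = fnn_fraction(ramp, d=1, tau=1, r_tr=10.0)
```

The quadratic cost of this scan is exactly what the box-assisted and kd-tree searches are meant to avoid for long time series.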

Equation 27.2 has to be evaluated for the whole time series and for several dimensions $d = \{1, 2, \ldots, m\}$ until the fraction of points for which it holds is zero, or at least sufficiently small (in practice, lower than 1%). The greater the value of n (the length of the time series), the more computationally expensive it is to find the nearest neighbor of each point. A review of

27

High Performance Computing Applied to FNN Method

325

methods to find nearest neighbors, which are particularly useful for the study of time series data, can be found in [4]. We focus on two approaches: one based on the box-assisted algorithm, optimized in the context of time series analysis by [5], and one based on a kd-tree data structure [6, 7] developed in the context of computational geometry. According to Schreiber, for time series that have a low embedding dimension (e.g. up to the tens), the box-assisted algorithm is particularly efficient. This algorithm can offer a complexity as low as $O(n)$ under certain conditions. On the other hand, according to the literature, if the embedding dimension is moderate, an effective method for nearest neighbor searching consists of using a kd-tree data structure [6, 7]. From the computational theory point of view, the kd-tree-based algorithm has the advantage of providing an asymptotic number of operations proportional to $O(n \log n)$ for a set of n points, which is the best possible performance for an arbitrary distribution of elements. We have applied the paradigm of parallel computing to implement three approaches directed towards distributed memory architectures, in order to make a comparative study between the method based on the box-assisted algorithm and the method based on the kd-tree data structure. The results are presented in terms of performance metrics for parallel systems, that is, execution time, speed-up and efficiency. Two case studies have been considered to carry out this comparative study: a theoretical case study, which consists of a Lorenz model, and a real case study, which consists of a time series belonging to electrocardiography. The paper is organized as follows. After this introduction, a description of the considered approaches is introduced in Sect. 27.2. In Sect. 27.3, the experimental results are presented. Finally, in Sect. 27.4 some conclusions and future work are outlined.

27.2 Parallel Approaches

We selected two programs to start this work: the false_nearest program based on the box-assisted algorithm [8, 9], and the fnn program based on a kd-tree data structure [10]. We employ the Single-Program, Multiple Data paradigm (SPMD [11]) to design the three parallel approaches. A coarse-grained decomposition [12] has been considered, i.e. we have a small number of parallel tasks, each with a large amount of computation. The approaches are directed towards distributed memory architectures using the Message Passing Interface [13] standard for communication purposes. Two approaches are based on the box-assisted algorithm and the other approach is based on the kd-tree data structure.


27.2.1 Approaches Based on Box-Assisted Algorithm

The box-assisted algorithm [5] considers a set of n points $y(i)$ in k dimensions. The idea of the method is as follows. Divide the phase space into a grid of boxes of side length $\varepsilon$. Each point $y(i)$ lies in one of these boxes. Its nearest neighbors are located in the same box or in one of the adjacent boxes. The false_nearest program is a sequential implementation of the FNN method based on this algorithm. By profiling the false_nearest program in order to carry out the parallel approaches, four tasks were identified. Let X be a time series, Y a set of points constructed according to Eq. 27.1, BOX an array that implements the grid of boxes (or mesh), and p the number of processes. Two parallel implementations were formed based on these four tasks:

Domain decomposition. Time series X is distributed to the processes. Two ways of distribution have been developed: Time Series (TS) and Mesh (M). In a TS data distribution the time series is split into p uniform parts of length $n/p$, n being the length of the time series. In an M data distribution, each process computes the points that lie in its range of rows. The range of the mesh rows is given by $s/p$, where s is the size of the BOX.

Grid construction. The BOX array is filled. Two ways of grid construction have been developed: S (Sequential) and P (Parallel). In an S construction each process fills the BOX sequentially, thus each one has a copy. In a P construction each process fills a part of the group of boxes located over a set of assigned mesh rows.

Nearest neighbors search. Each process solves its subproblems according to the chosen domain decomposition. In a TS data distribution each process uses the same group of points Y. In an M data distribution each process can use different groups of points.

Communication of results. Processes use MPI to synchronize the grid construction and to communicate the partial results at the end of each dimension.

The approaches are named according to the following nomenclature: DM-P-M denotes a Distributed Memory implementation in which the grid construction is in Parallel and the time series is distributed according to the Mesh; DM-S-TS denotes a Distributed Memory implementation in which the grid construction is Sequential and the Time Series is uniformly distributed to the processes. We have introduced MPI functions into the source codes to obtain programs that can be run on a distributed memory platform. The most important MPI functions used in these programs are as follows:

• MPI_Reduce Combines values provided by a group of MPI processes and returns the combined value in the MASTER process.
• MPI_Allreduce Same as MPI_Reduce except that the result appears in all the MPI processes.
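The TS data distribution and the final combination of partial results can be imitated without MPI, as in the sketch below. It is a single-process illustration only: `ts_ranges` and `reduce_sum` are hypothetical helper names, giving the first processes one extra element when n is not divisible by p is an assumption, and `reduce_sum` merely stands in for what a sum reduction such as MPI_Reduce would deliver to the MASTER process.

```python
def ts_ranges(n, p):
    """Split indices 0..n-1 into p nearly uniform contiguous parts (TS distribution)."""
    base, extra = divmod(n, p)
    ranges, start = [], 0
    for q in range(p):
        size = base + (1 if q < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges


def reduce_sum(partials):
    """Stand-in for a sum reduction over the per-process partial results."""
    return sum(partials)


# Hypothetical per-point flags (1 = point marked as having a false neighbour).
flags = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1]
partials = [sum(flags[a:b]) for a, b in ts_ranges(len(flags), 4)]
total = reduce_sum(partials)
```

Each simulated process only looks at its own index range, so the combined count equals the count a sequential pass would produce.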


Let p be the total number of MPI processes. Each process has an identifier $q \in \{0, 1, \ldots, p - 1\}$. The process $q = 0$ is treated as MASTER and processes with $q \neq 0$ are treated as slaves. The next algorithm depicts the algorithmic notation for the DM-P-M approach:

The next algorithm depicts the algorithmic notation for the DM-S-TS approach:


27.2.2 Approach Based on the kd-Tree Data Structure

A kd-tree data structure [6, 7] considers a set of n points $y(i)$ in k dimensions. This tree is a k-dimensional binary search tree that represents a set of points in a k-dimensional space. The variant described by Friedman et al. distinguishes between two kinds of nodes: internal nodes partition the space by a cut plane defined by a value in one of the k dimensions (the one with the maximum spread), and external nodes (or buckets) store the points in the resulting hyperrectangles of the partition. The root of the tree represents the entire k-dimensional space. The fnn program is a sequential implementation of the FNN method based on this structure.
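The node layout just described can be sketched as a small recursive builder. This is not the fnn program: the dictionary representation, the function names, the bucket size, and the choice of the median as cut value are all assumptions made for illustration.

```python
def build_kdtree(points, bucket_size=2):
    """Build a kd-tree whose internal nodes cut the dimension of maximum spread."""
    if len(points) <= bucket_size:
        return {"bucket": list(points)}            # external node (bucket)
    k = len(points[0])
    # Pick the coordinate in which the points are most spread out.
    dim = max(range(k),
              key=lambda a: max(p[a] for p in points) - min(p[a] for p in points))
    pts = sorted(points, key=lambda p: p[dim])
    mid = len(pts) // 2                            # median split (assumption)
    return {"dim": dim, "cut": pts[mid][dim],
            "left": build_kdtree(pts[:mid], bucket_size),
            "right": build_kdtree(pts[mid:], bucket_size)}


def count_points(node):
    """Total number of points stored in the buckets below a node."""
    if "bucket" in node:
        return len(node["bucket"])
    return count_points(node["left"]) + count_points(node["right"])


tree = build_kdtree([(0, 0), (1, 5), (2, 10), (3, 15)])
```

In this small example the second coordinate has the larger spread, so the root cuts along it and the four points end up in two buckets of two.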


The fnn program has also been analyzed by means of a profiling tool before making the parallel implementation, and five main tasks were identified. Let X be a time series, n the length of the time series, Y a set of points constructed according to Eq. 27.1, KDTREE a data structure that implements the kd-tree, p the number of processes, and $q \in \{0, 1, \ldots, p - 1\}$ a process identifier. For convenience we assume that p is a power of two. The parallel implementation, called KD-TREE-P, was formed based on these five tasks:

Global kd-tree building. The first $\log p$ levels of KDTREE are built. All processors perform the same task, thus each one has a copy of the global tree. The restriction $n \ge p^2$ is imposed to ensure that the first $\log p$ levels of the tree correspond to nonterminal nodes instead of buckets.

Local kd-tree building. The local KDTREE is built. At level $\log p$ of the global tree there are p nonterminal nodes. Each processor q builds a local kd-tree using the $(q + 1)$th node as root. The first $\log p$ levels are destroyed and KDTREE is pointed to the local tree.

Domain decomposition. Time series X is distributed to the processes. The building strategy imposes a distribution over the time series. Thus, the time series is split according to the kd-tree algorithm and the expected number of items contained in each local tree is approximately $n/p$.

Nearest neighbors search. Each process solves its subproblems. Each process searches for the nearest neighbors of all points in Y that are in the local KDTREE.

Communication of results. Processes use MPI to communicate their partial results once all dimensions have been processed. The master process collects all partial results and reduces them.

The next algorithm depicts the algorithmic notation for the KD-TREE-P approach:


27.3 Experimental Results
In order to test the performance of the parallel implementations, we have considered two case studies: the Lorenz time series generated by the system of equations described in [14], and the electrocardiogram (ECG) signal generated by a dynamical model introduced in [15]. The Lorenz system is a benchmark problem in nonlinear time series analysis, and the ECG model is used in biomedical science and engineering [16]. The parallel implementations have been run on a supercomputer called GALGO, which belongs to the Albacete Research Institute of Informatics [17]. The parallel platform consists of a cluster of 64 machines. Each machine has two Intel Xeon E5450 3.0 GHz processors and 32 GB of RAM. Each processor has 4 cores with 6,144 KB of cache memory. The machines run RedHat Enterprise version 5 and use an Infiniband interconnection network. The cluster is presented as a single resource which is accessed through a front-end node. The results are presented in terms of the performance metrics for parallel systems described in [12]: execution time Tp, speed-up S and efficiency E. These metrics are defined as follows:
• Execution time. The serial runtime of a program is the time elapsed between the beginning and the end of its execution on a sequential computer. The parallel runtime is the time that elapses from the moment that a parallel computation starts to the moment that the last processing element finishes its execution. We denote the serial runtime by Ts and the parallel runtime by Tp.
• Speed-up is a measure that captures the relative benefit of solving a problem in parallel. It is defined as the ratio of the time taken to solve a problem on a single processing element to the time required to solve the same problem on a parallel computer with p identical processing elements. We denote speed-up by the symbol S.
• Efficiency is a measure of the fraction of time for which a processing element is usefully employed; it is defined as the ratio of speed-up to the number of processing elements. We denote efficiency by the symbol E. Mathematically, it is given by $E = S/p$.
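As a small worked illustration of these definitions, the sketch below computes the speed-up and the efficiency from a serial and a parallel runtime. The timings are hypothetical values chosen for the example, not measurements from this chapter.

```python
def speedup(t_serial, t_parallel):
    """Speed-up S = Ts / Tp."""
    return t_serial / t_parallel


def efficiency(t_serial, t_parallel, p):
    """Efficiency E = S / p, as a fraction between 0 and (ideally) 1."""
    return speedup(t_serial, t_parallel) / p


# Hypothetical runtimes in seconds on p = 16 processing elements.
ts, tp, p = 120.0, 10.0, 16
s = speedup(ts, tp)
e = efficiency(ts, tp, p)
print(f"S = {s:.2f}, E = {100 * e:.1f}%")
```

An efficiency above 100% would indicate a super-linear speed-up.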


Table 27.1 Size of BOX for each value of p using a Lorenz time series

p     DM-P-M   DM-S-TS
1     8,192    8,192
2     4,096    4,096
4     2,048    4,096
8     2,048    4,096
16    2,048    2,048
32    2,048    2,048

Table 27.2 Size of BOX for each value of p using an ECG time series

p     DM-P-M   DM-S-TS
1     4,096    4,096
2     4,096    4,096
4     4,096    4,096
8     2,048    4,096
16    2,048    2,048
32    2,048    2,048

Let p be the number of processors. The execution time of the approaches has been measured for $p \in \{1, 2, 4, 8, 16, 32\}$, where p = 1 corresponds to the sequential version of each approach. We used one million records of the time series to calculate the first ten embedding dimensions. Using the mutual information method, we obtained an optimal time delay of τ = 7 for the Lorenz time series and τ = 5 for the ECG signal. In order to obtain the best runtime of the approaches based on the box-assisted algorithm, we found the best size of BOX for each value of p (Tables 27.1 and 27.2). The size of BOX defines the number of rows and columns of the grid of boxes. The values for p = 1 correspond to the sequential version of the program false_nearest. We have run ten tests to obtain the median value of the execution time Tp. In total 360 tests were performed. The performance metrics are shown in Figs. 27.1 and 27.2. The sequential kd-tree implementation shows a lower execution time than the box-assisted approach, since the grid construction stage of the box-assisted implementation in TISEAN is very expensive in terms of execution time. The behavior of the Lorenz case study and the ECG case study is quite similar. Notice that, according to Figs. 27.1b and 27.2b, the kd-tree implementation shows a super-linear speed-up when p < 8, and this performance decreases when p > 8. The super-linear speed-up is explained by the fact that the cache memory is better exploited and that, when the tree is split, fewer searches have to be done in each subtree. The loss of performance has several causes. The first one is that the overhead due to communications increases. The most important cause, however, is that the sequential part of our implementation becomes increasingly significant with respect to the parallel part.

Fig. 27.1 Performance metrics for the Lorenz case study, comparing DM-P-M, DM-S-TS and KD-TREE-P as a function of p (number of MPI processes): (a) execution time Tp (seconds); (b) speed-up S = Ts/Tp; (c) efficiency E = (S/p) × 100 (%)

Considering only the box-assisted implementations, DM-S-TS provides the best results for both the Lorenz attractor and the ECG signal. The reason is that it achieves a better data distribution than DM-P-M.

Fig. 27.2 Performance metrics for the ECG case study, comparing DM-P-M, DM-S-TS and KD-TREE-P as a function of p (number of MPI processes): (a) execution time Tp (seconds); (b) speed-up S = Ts/Tp; (c) efficiency E = (S/p) × 100 (%)

However, the reconstruction of the mesh is not parallelized in the DM-S-TS implementation, so this sequential part makes the reduction of execution time less significant as more CPUs are used. Nevertheless, as the execution time of the neighbor search increases (e.g. for larger time series), this circumstance becomes much less important. For the Lorenz attractor, the DM-S-TS implementation is around 1.8 times faster than the sequential program when it uses 2 CPUs, and around 12 times faster when it uses 32 CPUs. This means that the efficiency is around 92% for 2 CPUs and decreases to 37% for 32 CPUs. For the ECG signal, the best box-assisted parallel implementation achieves a speed-up of around 16 when it is run on 32 CPUs of GALGO, and its efficiency is around 93% using 2 CPUs and 51% using 32 CPUs. Unlike the previous case, the efficiency of the best implementation decreases more slowly. An optimized version of TISEAN has been used, which allows the best mesh size to be tuned for each case. If the original TISEAN (fixed mesh size) were used, the reduction of execution time would be more important. According to the experimental results, the kd-tree-based parallel implementation performs better than the box-assisted parallel implementations, at least in terms of execution time, for both case studies. Because of the large execution time reduction already provided by the kd-tree-based implementation, its speed-up and efficiency appear worse than those of the other approaches.

27.4 Conclusions
In this paper, a comparative study of distributed memory implementations of two different ways to compute the FNN method has been presented: one based on the box-assisted algorithm and one based on the kd-tree data structure. To make this comparative study, three different implementations have been developed: two based on the box-assisted algorithm and one based on the kd-tree data structure. The most important metric to consider is how well the resulting implementations accelerate the computation of the minimal embedding dimension, which is the ultimate goal of the FNN method. In terms of execution time, the parallel approaches are from 2 to 16 times faster than the sequential implementation, and the kd-tree approach is from 3 to 7 times faster than the box-assisted algorithm. With respect to the experimental results, the kd-tree-based parallel implementation provides the best performance in terms of execution time, which it reduces dramatically. As a consequence, its speed-up and efficiency are far from the ideal. However, it is necessary to deal with more case studies of special interest for the authors: wind speed, ozone, air temperature, etc. Regarding related work, in the context of parallel implementations of the FNN method, the work carried out by the authors could be considered the first one. The authors are also working on shared memory implementations using Pthreads [18, 19] or OpenMP [20, 21], and on hybrid MPI+Pthreads or MPI+OpenMP parallel implementations. As future work, the authors are also considering the development of GPU-based parallel implementations of the algorithms considered in this paper.


To sum up, we hope that our program will be useful in applications of nonlinear techniques to the analysis of real as well as artificial time series. This work represents the first step of nonlinear time series analysis, which becomes meaningful when later stages of the analysis, such as prediction, are considered, and when time is a crucial factor for the application.

Acknowledgments This work has been supported by National Projects CGL2007-66440-C04-03 and CGL2008-05688-C02-01/CLI. A short version was presented in [22]. In this version, we have introduced the algorithmic notation for the parallel implementations.

References
1. Kennel MB, Brown R, Abarbanel HDI (1992) Determining embedding dimension for phase space reconstruction using the method of false nearest neighbors. Phys Rev A 45(6):3403–3411
2. Fraser AM, Swinney HL (1986) Independent coordinates for strange attractors from mutual information. Phys Rev A 33(2):1134–1140
3. Takens F (1981) Detecting strange attractors in turbulence. In: Rand DA, Young L-S (eds) Dynamical systems and turbulence, Warwick 1980. Springer, New York, pp 366–381
4. Schreiber T (1995) Efficient neighbor searching in nonlinear time series analysis. Int J Bifurcation Chaos 5:349
5. Grassberger P (1990) An optimized box-assisted algorithm for fractal dimensions. Phys Lett A 148(1–2):63–68
6. Bentley JL (1975) Multidimensional binary search trees used for associative searching. Commun ACM 18(9):509–517
7. Friedman JH, Bentley JL, Finkel RA (1977) An algorithm for finding best matches in logarithmic expected time. ACM Trans Math Software (TOMS) 3(3):209–226
8. Hegger R, Kantz H, Schreiber T (1999) Practical implementation of nonlinear time series methods: the TISEAN package. Chaos 9(2):413–435
9. Hegger R, Kantz H, Schreiber T (2007) Tisean: nonlinear time series analysis. http://www.mpipks-dresden.mpg.de/~tisean
10. Kennel MB (1993) Download page of fnn program. ftp://lyapunov.ucsd.edu/pub/nonlinear/fns.tgz
11. Darema F (2001) The spmd model: past, present and future. In: Lecture notes in computer science, pp 1–1
12. Grama A, Gupta A, Karypis G, Kumar V (2003) Introduction to parallel computing. Addison-Wesley, New York
13. Message Passing Interface. http://www.mcs.anl.gov/research/projects/mpi
14. Lorenz EN (1963) Deterministic nonperiodic flow. J Atmos Sci 20(2):130–141
15. McSharry PE, Clifford GD, Tarassenko L, Smith LA (2003) A dynamical model for generating synthetic electrocardiogram signals. IEEE Trans Biomedical Eng 50(3):289–294
16. ECGSYN (2003) Ecgsyn: a realistic ecg waveform generator. http://www.physionet.org/physiotools/ecgsyn
17. Albacete Research Institute of Informatics. http://www.i3a.uclm.es
18. Mueller F (1999) Pthreads library interface. Institut fur Informatik
19. Wagner T, Towsley D (1995) Getting started with POSIX threads. Department of Computer Science, University of Massachusetts
20. Dagum L (1997) Open MP: a proposed industry standard API for shared memory programming. OpenMP.org
21. Dagum L, Menon R (1998) Open MP: an industry-standard API for shared-memory programming. IEEE Comput Sci Eng 5:46–55
22. Águila JJ, Marín I, Arias E, Artigao MM, Miralles JJ (2010) Distributed memory implementation of the false nearest neighbors method: kd-tree approach versus box-assisted approach. In: Lecture notes in engineering and computer science: proceedings of the World Congress on engineering 2010, WCE 2010, 30 June–2 July, London, UK, pp 493–498

Chapter 28

Ethernet Based Implementation of a Periodic Real Time Distributed System

Sahraoui Zakaria, Labed Abdennour and Serir Aomar

Abstract This work presents the realization of a platform for testing and validating distributed real time systems (DRTS), by following a methodology of development. Our main contribution remains the realization of an industrial communication bus (FIP: Factory Instrumentation Protocol) implemented on an Ethernet platform. It focuses on improving the response time of the bus. For that, we use a deterministic implementation of FIP’s services (variables identification, transmission functions, …) by exploiting the TCP/IP stack. The periodic communications are monitored by real time periodic threads, run on RTAI kernel.

28.1 Introduction
The development of real time and distributed industrial systems often rests on an appropriate methodology. Their implementation may be based on a programming language or on a fast prototyping tool that involves simulators, code generators and hardware in the loop. The design therefore follows a model, an architecture and a language appropriate to the applied methodology. The validation step of such systems, in particular, requires platforms and middleware to distribute the controls, the computation or the data. These platforms ensure the management of the field buses.

S. Zakaria (corresponding author), L. Abdennour, S. Aomar
Computer Science Department, EMP, BP 17 BEB, Algiers, Algeria
e-mail: [email protected]
L. Abdennour e-mail: [email protected]
S. Aomar e-mail: [email protected]

S. I. Ao and L. Gelman (eds.), Electrical Engineering and Applied Computing, Lecture Notes in Electrical Engineering, 90, DOI: 10.1007/978-94-007-1192-1_28, Springer Science+Business Media B.V. 2011


Distributed real time applications must satisfy two conditions to communicate data: determinism and reliability. Conventionally, industrial local area networks, or any network in a hostile environment (e.g. the engine of a vehicle), use field buses such as the controller area network (CAN) and FIP to meet these two requirements. The FIP field bus is a platform offering a configuration interface that allows a station¹ to take its place in the FIP network. The station can then produce and consume periodic or aperiodic variables, send and receive messages with or without connection, or assure the arbitration function of the bus. The arbitrator holds the list of variables, which are created at configuration time. For each variable, the producer and its periodicity of use on the bus are defined [1, 2]. Industrial networks tend to exploit the possibilities of Ethernet, which has proven to be well adapted since it has a favorable performance/cost ratio. In the present work we describe a platform that we have developed to test DRTS. We have adopted the methodology proposed by Benaissa and Serir in [3]. It is a development approach based on a top-down hierarchical functional decomposition of a control system. The primitive processes of the system are specified by elementary grafcets that communicate and synchronize using messages. The grafcet processes may refer to components of the same site or of different (distributed) sites. The link between sites is provided by a communication system similar to the FIP field bus, with a deterministic, periodic and synchronous behavior. Our contribution concerns a FIP configuration on Ethernet. We have mapped the FIP bus architecture (protocols and services) onto its counterpart, the Ethernet communication system together with TCP/IP. The aim is to check that the specification of the resulting FIP bus fits all the mechanisms of FIP: the use of broadcasting in the communication system, acknowledgment of variable identifiers by the different sites, and error verification services (physical, link and application layers of OSI). Consequently, the validation rests on the performance of the real time micro kernel RTAI² that we have used. We will analyze the TCP/IP/Ethernet model in Sect. 28.2. In Sect. 28.3 we describe the design approach of the proposed FIP field bus. In Sect. 28.4, we implement an example of a distributed real time application specified according to the methodology. Section 28.5 summarizes and discusses the results of the tests. We conclude in Sect. 28.6.

28.2 Related Works
Data exchange between and within field buses requires high rate communications. However, the majority of field buses, such as WorldFIP, FF, Profibus, P-Net and CAN, offer insufficient rates (31.25 kbps–5 Mbps). To overcome this problem, high rate networks such as FDDI and ATM were proposed, and their capability to support strict time constraints and soft real time has been evaluated. However, these solutions did not meet expectations because of their high cost and complexity of implementation [4]. Some improvements have been made to Ethernet to help it support time-constrained communications: either the Ethernet protocol is modified, or a deterministic layer is implemented over the MAC sublayer. One solution consists in using the TDMA strategy, but it has the drawback of wasting the time slots of idle stations (no transmissions). As examples, we can cite P-CSMA (Prioritized CSMA) and RTNET of the RTAI micro kernel. The PCSMA (Predictable CSMA) technique is data scheduling oriented, and all real time data are assumed to be periodic. Though it avoids wasted time, it incurs an overhead in its off-line scheduling. We can also find techniques based on a modification of the binary exponential backoff (BEB) algorithm, like CSMA/DCR (CSMA with deterministic collision resolution), which uses a binary tree search instead of the non-deterministic BEB [5]. Such techniques may support strict as well as soft real time applications by changing the basic structure of Ethernet. Adding a deterministic layer on top of MAC may lead to the same result. Among these solutions, we have the Virtual Time Protocol (VTP), the Window Protocol (WP) and traffic smoothing (TS) [4]. Middleware-based communication protocols have recently been proposed for applications in automation. They are implemented either on TCP, like Modbus TCP and ProfiNet, or on UDP, such as NDDS (Network Data Delivery Service) [6, 7]. We end the list of technical solutions with Avionics Full DupleX switched Ethernet base 100 TX (AFDX).

¹ Part of the bus that can be a computer, an automaton, a sensor or an actuator.
² Open source, preemptible, without latency problems on operating system calls, and with a known jitter.

28.3 Proposed Approach (FIP over ETHERNET/IP)
The platform design results from an analysis of the Ethernet/TCP/IP model against its counterpart, the layers of FIP.

28.3.1 Analysis of the Physical Layer
The WorldFIP norm defines three transmission rates: 31.25 kbit/s with a bit transmission time of Tbit = 32 μs, 1 Mbit/s with Tbit = 1 μs and 2.5 Mbit/s with Tbit = 400 ns [1]. The Ethernet rate varies between 10 Mbit/s with Tbit = 512 μs and 10 Gbit/s with Tbit = 512 ns [8]. Manchester II coding is used for 10 Mbit/s Ethernet and for WorldFIP. Ethernet 100BaseTx, for instance, uses 4B/5B coding, which limits Tbit to 8 ns [8]. Note that the effects of noise on the FIP bus are similar to those on the Ethernet categories which use a short Tbit and a low modulation speed.
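The WorldFIP bit times quoted above follow directly from Tbit = 1/rate. The short check below reproduces them; the function name is chosen only for this illustration.

```python
def bit_time_ns(rate_bit_per_s):
    """Bit transmission time in nanoseconds for a given line rate."""
    return 1e9 / rate_bit_per_s


# The three WorldFIP rates: 31.25 kbit/s, 1 Mbit/s and 2.5 Mbit/s.
worldfip_rates = (31_250, 1_000_000, 2_500_000)
bit_times = [bit_time_ns(rate) for rate in worldfip_rates]
for rate, tbit in zip(worldfip_rates, bit_times):
    print(f"{rate:>9} bit/s -> Tbit = {tbit:g} ns")
```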


Fig. 28.1 Computation of the delay for the periodic traffic

Fig. 28.2 Link layer frames’ format of WorldFIP (CENELEC EN61158-2 norm of FIP) [1]

28.3.2 Analysis of the Data Link Layer
The first step consists in analyzing the protocols ARP³ (address resolution protocol) and LLC (logical link control) of TCP/IP, and the services that may affect the traffic, in order to preserve a deterministic behavior. For our experiments, we used a LAN with one hub and two PCs on which we installed a Linux system provided with a traffic analyzer (Ethereal). When capturing the Ethernet traffic, we noticed that, in the absence of application traffic, three kinds of queries may occur on the network: two LLC control queries (initialization), two ARP queries with the sender address (generated by default every 180 s) and queries from active services. Furthermore, when we replaced the hub by a switch, we noticed that the switch also uses the LLC protocol to initialize the network (SSAP query: spanning tree BPDU⁴ command with a forward delay of 15 s). To avoid the non-determinism of the protocol, it suffices to set the duration BaseReachable-Time to an unreachable value. Otherwise, the time wasted during a deterministic emission or reception (periodic identification queries, every 180 s) can be bounded by (n + 1) Tra. In this relation, n is the ratio between the duration of an emission and the period of an ARP query, and Tra is the transmission time of an ARP query (Fig. 28.1).

28.3.3 Benefit of the Data Link Layer
The FIP frame format at the link level (Fig. 28.2) involves a control byte that codes the frame type (ID_DAT, RP_DAT, ID_RQI, RP_ACK, …), data bytes (128 bytes or 256 bytes) and two bytes allowing a receiver to check the integrity of the received frame [1]. The source and destination MAC addresses of Ethernet frames [5] (Figs. 28.3 and 28.4) have no role in the specification, since the identification of sites in the FIP bus is of no interest at this level. With the Ethernet CSMA/CD protocol, transmission errors are not detected through the absence of acknowledgment, but through interference. In the FIP implementation over Ethernet, the timing is assured implicitly. However, the frames (identifier, variable response, query response, …) are processed at the transport and application layers. Error control is provided by the CRC for both FIP and Ethernet, using the same code.

Fig. 28.3 Ethernet II frame

Fig. 28.4 The Ethernet IEEE 802.3 frame

³ ARP: protocol of layer three which makes the correspondence between Internet logical addresses and MAC addresses.
⁴ Used by switches and routers to avoid loops on a WAN.

28.3.4 Benefit of the Network Layer (IP)
In the OSI standard, the TCP/IP protocol offers routing operations, so interconnection between any pair of machines is possible. In a FIP network, however, site identification is implicitly provided by the identifiers of the variables to be transmitted. Recall that the client sites of a FIP system are synchronized only by variable identifiers, and an IP address gives no information on variable identifiers. Consequently, in our design, the sites participating in the exchange of a variable are implicitly identified as producer of this variable via the broadcasting principle. The use of hubs implicitly offers the broadcasting possibility. If switches are used, an appropriate configuration is necessary (ports configured in promiscuous mode).

28.3.5 Benefit of the UDP Layer
The UDP protocol has been the only solution for many tools used to implement real time applications. The nature of UDP datagrams is ideal for sending the fragments of data generated by such applications. UDP is selected for the speed of communication between its clients. It uses a simplified header structure, restricted to the fields shown in Fig. 28.5. The checksum of the header is computed as for IP packets.
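To make the simplified header concrete, the sketch below packs the four 16-bit UDP header fields (source port, destination port, length, checksum) in network byte order. The port numbers and the payload are hypothetical, and the checksum is left at 0, which UDP over IPv4 interprets as "no checksum computed"; this is only an illustration of the layout, not the code of the platform.

```python
import struct


def build_udp_header(src_port, dst_port, payload, checksum=0):
    """Pack the 8-byte UDP header for the given payload.

    The length field counts the header (8 bytes) plus the payload.
    """
    length = 8 + len(payload)
    return struct.pack("!HHHH", src_port, dst_port, length, checksum)


# Hypothetical ports and payload, e.g. an identifier frame sent by an arbitrator.
payload = b"ID_DAT"
header = build_udp_header(5000, 5001, payload)
print(header.hex())
```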

Fig. 28.5 The UDP datagram or message format: Source Port (bits 0–15) and Destination Port (bits 16–31), followed by Length and Checksum, then Data

Fig. 28.6 Architecture of the adopted transport level (showing the arbitrator table A and the consumer table C)

28.4 Adopted Architecture for the Transport Level
The architecture of Fig. 28.6 shows the solution we have chosen among many other possible architectures. Using tables P and C to structure the producer and consumer variables simplifies the management of variables at the processing step. Different port numbers are used for the arbitrator and for the producer–consumer sites, in order to separate the messages intended to identify variables from those which contain variables. Using a single port instead of several at each producer/consumer site allows a synchronous design of the production and consumption functions. A producer site receives the identifiers ID_DAT of the variables to be produced. The consumer detects the arrival of these frames in order to enable an internal timer. If this timer expires, the station considers the next frame only if it comes from the emitting port of the arbitrator. Recall that message exchange on the FIP bus is done point to point or multipoint on the same segment. Two addresses of 24 bits (source and destination) code the number of the segment of the application entity and its address on this segment. Hence, IP addressing may be used to perform these transactions.

28.5 Maximal Transferring Time (Critical Time)
Real time and distributed applications impose temporal constraints on the completion of tasks; these constraints have a direct impact on the messages exchanged between tasks located on different processors. In real time applications, tasks may or may not have temporal constraints, and the same holds for the messages exchanged between them. As indicated in Fig. 28.7, the transferring time of a message is composed of several intermediate times, which are summarized in Table 28.1. If we denote by Δt the duration of a transaction, then:

Fig. 28.7 Transmission times (on each side: FIP task, TCP/IP layers, MAC sublayer; between the two sides: medium, with its transmission and propagation time)

Table 28.1 Notation used for transmission times

Identifiers transmission time               Notation   Variables transmission time                    Notation
Identifier sending (FIP arbitrator task)    Tta        Variable sending time (FIP production task)    Ttp
Latest Sendto(socket,UDP…)                  Tst        Latest Sendto(socket,UDP…)                     Tst
MAC emission                                Tsm        MAC emission time                              Tsm
Transmission on the medium                  Tp         Transmission on the medium                     Tp
MAC reception                               Trm        MAC reception time                             Trm
Recvfrm(Socket, UDP…) function              Trt        Recvfrm(Socket, UDP…) function                 Trt
Acknowledgement (FIP producer task)         Trp        Checking (FIP arbitrator task)                 Trp

$$\Delta t = T_{ta} + 2T_{st} + 2T_{sm} + 2T_{p} + 2T_{rm} + 2T_{rt} + 2T_{rp} + T_{tp} \qquad (28.1)$$

Given that, in our protocol, the sources of non-determinism are all periodic services, the maximal time of a periodic transaction is bounded by:

$$\Delta t \le T_{ta} + 2T_{st} + 2T_{sm} + 2T_{p} + 2T_{rm} + 2T_{rt} + 2T_{rp} + T_{tp} + (n + 1)\,T_{ra} \qquad (28.2)$$
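Equations (28.1) and (28.2) can be evaluated with a few lines of code. In the sketch below the dictionary keys follow the notation of Table 28.1, while the numeric values (in microseconds), n and Tra are hypothetical and serve only to show the arithmetic.

```python
def transaction_time(t):
    """Duration of one transaction according to Eq. (28.1)."""
    doubled = t["Tst"] + t["Tsm"] + t["Tp"] + t["Trm"] + t["Trt"] + t["Trp"]
    return t["Tta"] + 2 * doubled + t["Ttp"]


def transaction_time_bound(t, n, t_ra):
    """Upper bound of Eq. (28.2): Eq. (28.1) plus (n + 1) ARP query times."""
    return transaction_time(t) + (n + 1) * t_ra


# Hypothetical intermediate times in microseconds.
times = {"Tta": 10, "Tst": 5, "Tsm": 2, "Tp": 1,
         "Trm": 2, "Trt": 5, "Trp": 3, "Ttp": 10}
nominal = transaction_time(times)
bound = transaction_time_bound(times, n=0, t_ra=6)
print(nominal, bound)
```

The difference between the two values is the worst-case time lost to periodic ARP queries during the transaction.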

28.6 Implementation of the Communication System
28.6.1 Hardware and Software of the Platform
Each component of our case study is materialized by a PC equipped with a 100BaseT Ethernet network interface controller and is considered as a site. The RTAI system is installed on each site. We have used four machines: three are Pentium 4 with a 2.39 GHz CPU; the fourth is a Pentium 2 with a 233 MHz CPU. The first machine plays the role of arbitrator, and the second and the fourth play the role of producers–consumers (site 1 and site 2). The third has a monitoring role and analyzes the traffic using the Ethereal tool.

Fig. 28.8 Modified macrocycle

28.6.2 Our Arbitration Function
Variable identifiers are scheduled on a macrocycle as follows:
• a task is associated with each identifier;
• each task must be triggered periodically in the macrocycle, at a precise time of the elementary cycle, to broadcast the identifier;
• the arbitrator executes the tasks by means of a preemptive priority scheduler; a task is elected by the scheduler only if all its preceding tasks have completed;
• the time remaining after the execution of all the tasks in a microcycle is used for aperiodic exchange;
• the awakening date of a periodic task i is computed by taking into account the total transfer time of all the previous transactions of the microcycle (Fig. 28.8).
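The last two rules of this list can be sketched as follows. This is an illustrative calculation, not the RTAI arbitrator itself: the awakening date of each task is taken as the sum of the transaction times of the tasks that precede it in the microcycle, and whatever remains of the cycle is left for aperiodic exchange. The durations and the cycle length (in microseconds) are hypothetical.

```python
def awakening_dates(transaction_times_us, cycle_us):
    """Return the awakening date of each periodic task and the remaining time.

    Task i wakes up once all previous transactions of the microcycle
    have had time to complete.
    """
    dates = []
    elapsed = 0
    for duration in transaction_times_us:
        dates.append(elapsed)
        elapsed += duration
    if elapsed > cycle_us:
        raise ValueError("transactions do not fit in the elementary cycle")
    return dates, cycle_us - elapsed


# Hypothetical bounded transaction times for three identifiers, 20 ms cycle.
dates, aperiodic_window = awakening_dates([1200, 1200, 500], 20_000)
print(dates, aperiodic_window)
```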

28.6.3 The Producer–Consumer Function
We now specify the production and consumption tasks of a site. The first task to be executed is the production function, because the client site first has to wait for a possible variable identifier using the primitive recvfrom. The producer task then scans its production table to check whether it is concerned by the variable associated with this identifier, in which case it broadcasts the variable. If the site is not the producer, the same task scans its consumption table to check whether it has to receive the variable, on the same port and using the same primitive. In the consumption processing, the task enables an internal timer; when this timer expires, the frame is considered lost, which preserves the global order of the system.
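The decision taken by a site on each received identifier can be summarized by the sketch below. It leaves out sockets and timers and only shows the table lookups; the function name, the returned action labels and the identifiers `m1`, `m2`, `m3` are hypothetical.

```python
def handle_identifier(identifier, produced, consumed):
    """Decide what a site does when the arbitrator broadcasts an identifier.

    produced: dict mapping identifiers to the current value to broadcast.
    consumed: set of identifiers this site must receive.
    """
    if identifier in produced:
        # The site is the producer: broadcast the variable value.
        return ("broadcast", produced[identifier])
    if identifier in consumed:
        # The site is a consumer: wait for the variable (timer armed elsewhere).
        return ("await_variable", None)
    return ("ignore", None)


# Hypothetical tables for one site: it produces m1 and m2 and consumes m3.
production_table = {"m1": 1, "m2": 0}
consumption_table = {"m3"}
actions = [handle_identifier(i, production_table, consumption_table)
           for i in ("m1", "m3", "m9")]
print(actions)
```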


28.6.4 Use of Real Time FIFO Mechanism
Technically, it was not possible to compile a new Ethernet network driver over RTAI. The Ethernet support of the Linux system is therefore used, via the mechanism of real time queues (rt_FIFO), to communicate between ordinary processes and RTAI real time processes. The arbitrator creates two rt_FIFOs for its services, because the primitive (rtf_get) used to read variables relies on a separate mechanism of asynchronous nature. This primitive has been placed in a function that we have called monitoring.

28.6.5 Monitoring Function
This function handles the exchange of variables via the real time queues and computes the transaction time. It is automatically enabled by the arrival of a variable in the queue (i.e. when a Linux process has inserted the variable in the queue). This mechanism is assured by the rtf_create_handler(fifo, monitoring_func) primitive. Hence, unlike the FIP bus variables, the rt_FIFO buffer has no refreshment or promptness problem.

28.6.6 Schedulability
In our implementation we used an arbitrator which involves a set of RTAI periodic tasks, and another function for the producer–consumer site. The latter is executed sequentially and respects the order of a FIP transaction. The fact that the arbitrator tasks are periodic makes it possible to apply the schedulability criterion of formula (28.3) and to compute the maximal execution times of transactions:

$$\sum_{i=1}^{n} \frac{\Delta T_i}{PT_i} \le 1 \qquad (28.3)$$

where ΔTi is the time of the ith FIP periodic transaction and PTi is the period of the ith FIP transaction.
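Criterion (28.3) is a simple utilization test and can be checked as follows. The transaction times and periods below are hypothetical values in milliseconds, used only to show the calculation.

```python
def utilization(transactions):
    """Sum of transaction time over period for each periodic transaction."""
    return sum(delta_t / period for delta_t, period in transactions)


def is_schedulable(transactions):
    """Apply criterion (28.3): the total utilization must not exceed 1."""
    return utilization(transactions) <= 1.0


# Hypothetical (transaction time, period) pairs in milliseconds.
light_load = [(1.2, 20.0), (1.2, 40.0), (0.5, 40.0)]
overload = [(15.0, 20.0), (12.0, 20.0)]
print(is_schedulable(light_load), is_schedulable(overload))
```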

28.7 A Case Study
In this example, the application is composed of two grafcets distributed over the communication system (Fig. 28.9).


Fig. 28.9 Example of distributed grafcets

Fig. 28.10 Arbitrator table

Communication between the two grafcets requires the transmission of input a (m1) and state X21 (m2) from site 2 to site 1, and the transmission of state X20 (m3) from site 1 to site 2, for each period of the macrocycle. Message m1 is transmitted during the first elementary cycle. Messages m2 and m3 are transmitted during the second one, in the order m2, m3. To assure a coherent arbitration, we have bounded the time of a transaction and, consequently, the time to be added to the periods of the variables. To do so, we compute the differences between the clock value read after every sending and the corresponding clock value sent by the monitoring function. The upper bound is then the maximum of the obtained results (Fig. 28.10).

28.7.1 Experimental Results
To compare our solution to the original FIP, we have considered two parameters: the time of transactions and the duration of the production function. Since we have initialized the timer period (periodic mode) to 119 ticks, i.e. 100,000 ns, a time value expressed in clock ticks is converted to nanoseconds by multiplying it by 840.336.
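The conversion factor follows from the two figures given above (119 ticks for 100,000 ns). A minimal check, with names chosen for this illustration:

```python
PERIOD_NS = 100_000      # timer period in nanoseconds
PERIOD_TICKS = 119       # the same period expressed in timer ticks
NS_PER_TICK = PERIOD_NS / PERIOD_TICKS


def ticks_to_ns(ticks):
    """Convert a duration measured in timer ticks to nanoseconds."""
    return ticks * NS_PER_TICK


print(round(NS_PER_TICK, 3))
print(ticks_to_ns(119))
```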


Fig. 28.11 Interpretation result

Table 28.2 Transaction times in milliseconds

           Test 1   Test 2   Test 3   Test 4
Average    0.700    0.322    0.460    0.600
Maximum    1.2      0.5      1.2      1.2
Minimum    0.1      0.1      0.3      0.3

We give below two sets of results (Fig. 28.11), corresponding respectively to a setup with a 10BaseT hub and to a setup with a 100BaseT switch. Note that we have used category 5 UTP cables. We have estimated the time used by a producer to produce the response frame of variable mi and the associated propagation time. The values sent by the identifier sending task and those sent by the monitoring function are given in clock ticks. The results of the subtractions between the durations are converted into seconds.

28.7.2 Discussions
Table 28.2 gives an idea of some measured times. It is clear that transaction times may be lowered by using a switch. The maximal values are almost equivalent for all the tests and vary between 0.5 and 1.2 ms. Production times are often less than half of the transaction times, which indicates that the delay comes from the slow and non-deterministic emission and reception functions of the Linux arbitrator.


Table 28.3 Sample of the original FIP scan speed [3]

Scanned variables   Variable size (bytes)   Scanned variables   Variable size (bytes)
320                 1                       181                 16
304                 2                       123                 32
277                 4                       75                  64
235                 8                       42                  128

Fig. 28.12 Comparison of scan speeds

For example, the value 1.2 ms is the time of the seventh transaction of test 3, whereas its production time is only 0.047 ms; the delay is therefore due to the emission and reception functions of the arbitrator (Fig. 28.11). As another example, the maximal value obtained in test 1 corresponds to the transaction of variable m3; here the value of 0.979 ms is the production time of the variable by the slowest machine (site 2).

28.7.3 Scan Speed of Variables
For a FIP network at 2.5 Mbps, with a reversal time of 10 ms and an elementary cycle of 20 ms, the number of variables that can be scanned is given in Table 28.3. From Table 28.2, we can derive the interval of variation of the arbitrator scan speed (Fig. 28.12). For the switch, the scan speed may reach 200 variables per 20 ms. For the original FIP, this speed is reached only when the size of the variables decreases from 16 to 8 bytes.
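The figure of 200 variables per 20 ms can be reproduced by dividing the elementary cycle by a transaction time from Table 28.2. The sketch below assumes one variable per transaction and works in integer microseconds; it is an illustration of the arithmetic rather than the measurement procedure of the chapter.

```python
def variables_per_cycle(cycle_us, transaction_us):
    """Number of whole transactions that fit in one elementary cycle."""
    return cycle_us // transaction_us


CYCLE_US = 20_000  # elementary cycle of 20 ms

# Minimum (0.1 ms) and maximum (1.2 ms) transaction times of Table 28.2.
fastest = variables_per_cycle(CYCLE_US, 100)
slowest = variables_per_cycle(CYCLE_US, 1_200)
print(fastest, slowest)
```

With the minimum transaction time of 0.1 ms the arbitrator can scan 200 variables per cycle; with the maximum of 1.2 ms the count falls to 16.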

28.7.4 Useful Bit Rate The useful bit rate is the ratio of the effective information to the duration of a transaction. For variables of size L between 1 and 16 bytes, a MAC frame always has a size of 64 bytes. If the variable size L is greater than 16 bytes, the size of the MAC frame varies between 64 and 1500 bytes. Nevertheless, transmitted information is segmented into frames of 1500 bytes if the variable size exceeds 1453 bytes.

Table 28.4 Comparison between the standard FIP and implemented FIP

System             Useful data   Transmitted data    Transaction   Useful bit rate   Efficiency
                   (byte)        (byte)              time (ms)     (Kbps)            (%)
FIP                4             19 or (16 + 4)      0.072         444.44            17.77
FIP                8             23                  0.084799      754.72            19.27
FIP                16            31                  0.013800      1159.42           32.32
FIP                32            47                  0.020200      1584.16           48.85
FIP                64            79                  0.033000      1939.39           65.64
FIP                128           143                 0.058600      2184.30           79.25
Implemented FIP    4             64                  0.099999      5120              5.12
Implemented FIP    8             64                  0.099999      5120              5.12
Implemented FIP    16            64                  0.099999      5120              5.12
Implemented FIP    32            79 or (47 + 32)     0.099999      6320              6.32

28.7.5 Efficiency We now compute the efficiency of our solution as the ratio between the emission time of the effective information and the duration of the transaction. This is equivalent to the ratio of the useful bit rate to the transmission rate. Table 28.4 compares the efficiency of the FIP [2] with that of our implementation. To complete the table, we deduce the FIP transaction time as the ratio of the length of the useful information to the useful bit rate. To compute the useful bit rate, we take the global minimum of all the durations (tests of the previous example). The tests on the implemented FIP with a variable of 32 bytes gave the same minimums. This table gives an idea of the margin that we have on the size of data that can be transmitted in a transaction. Hence, efficiency increases if we add other services (aperiodic variable exchange).
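The two definitions above (useful bit rate and efficiency) can be illustrated with a minimal Python sketch. It uses only the first two standard FIP rows of Table 28.4 and assumes a line rate of 2.5 Mbps for the standard FIP, as quoted in Sect. 28.7.3. The function names are illustrative and are not part of the original implementation.

```python
def useful_bit_rate_kbps(useful_bytes: int, transaction_ms: float) -> float:
    """Useful bit rate: useful bits divided by the transaction time.

    Bits per millisecond are numerically equal to kbit/s.
    """
    return useful_bytes * 8 / transaction_ms


def efficiency_percent(bit_rate_kbps: float, line_rate_kbps: float) -> float:
    """Efficiency: useful bit rate relative to the transmission rate."""
    return 100.0 * bit_rate_kbps / line_rate_kbps


# Assumed line rate of the standard FIP (2.5 Mbps, see Sect. 28.7.3).
FIP_LINE_RATE_KBPS = 2500.0

# First two FIP rows of Table 28.4: (useful bytes, transaction time in ms).
rate_4 = useful_bit_rate_kbps(4, 0.072)
rate_8 = useful_bit_rate_kbps(8, 0.084799)
eff_4 = efficiency_percent(rate_4, FIP_LINE_RATE_KBPS)

print(f"4 bytes: {rate_4:.2f} kbps, efficiency {eff_4:.2f} %")
print(f"8 bytes: {rate_8:.2f} kbps")
```

For these two rows the computed bit rates agree with the tabulated 444.44 and 754.72 Kbps to within rounding.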

28.8 Conclusions and Future Work We have compared the results obtained for the distributed grafcets and the standard FIP on a practical example. The comparison concerned the scanning speed of the arbitration table and the efficiency of the communication system in its cyclic part. Results obtained using switched Ethernet (switches) show that the arbitrator of the implemented FIP can scan its arbitration table faster than the standard FIP.


The implemented platform confirms the validity of the distributed grafcet model. It constitutes by itself a new design for the implementation of distributed real-time systems, which can be described as distributed systems for field database management.

References
1. WorldFIP tools FIPdesigner hlp technologies (2000) LM2-CNF-2-001-D, 12 Jul
2. WorldFip Protocole (1999) European standard, En 50170. http://www.WorldFIP.org
3. Benaissa M (2004) GRAFCET based formalism for design of distributed real time systems. Master Thesis, EMP Bordj-El-Bahri, Algiers, Algeria (in French)
4. Wang Z, Song YQ, Chen JM, Sun YX (2001) Real time characteristics of Ethernet and its improvement. In: Proceeding of the world congress on intelligent control and automation, June
5. Pujolle G (1997) Networks, 2nd edn. Eyrolles (in French)
6. Venkatramani C, Chiueh T, Supporting real-time traffic on Ethernet. In: 1052-8725/94 IEEE
7. Doléjs O, Hanzalék Z (2003) Simulation of Ethernet for real time applications. In: IEEE, ICIT, Maribor, Slovenia
8. Sevin C (2006) Telecommunication and networks. Dunod (in French)

Chapter 29

Preliminary Analysis of Flexible Pavement Performance Data Using Linear Mixed Effects Models

Hsiang-Wei Ker and Ying-Haur Lee

Abstract Multilevel data are very common in many fields. Because of their hierarchical structure, multilevel data are often analyzed using Linear Mixed-Effects (LME) models. The exploratory analysis, statistical modeling, and examination of model fit for LME models are more complicated than those for standard multiple regression. A systematic modeling approach using visual-graphical techniques and LME models is proposed and demonstrated using the original AASHO road test flexible pavement data. The proposed approach, which includes exploring the growth patterns at both group and individual levels, identifying the important predictors and unusual subjects, choosing suitable statistical models, selecting a preliminary mean structure, selecting a random structure, selecting a residual covariance structure, model reduction, and examining the model fit, is further discussed.

H.-W. Ker (&) Department of International Trade, Chihlee Institute of Technology, Taipei, 220, Taiwan e-mail: [email protected]
Y.-H. Lee Department of Civil Engineering, Tamkang University, Taipei, 251, Taiwan e-mail: [email protected]

S. I. Ao and L. Gelman (eds.), Electrical Engineering and Applied Computing, Lecture Notes in Electrical Engineering, 90, DOI: 10.1007/978-94-007-1192-1_29, Springer Science+Business Media B.V. 2011

29.1 Introduction Longitudinal data are used in the research on growth, development, and change. Such data consist of measurements on the same subjects repeatedly over time. To describe the pattern of individual growth, make predictions, and gain more insight into the underlying causal relationships related to developmental patterns requires studying the structure of measurements taken on different occasions [1]. Multivariate analysis of variance (MANOVA), repeated measures ANOVA, and standard multiple regression methods have been the most widely used tools for analyzing longitudinal data. Polynomial functions are usually employed to model individual growth patterns.

Classical longitudinal data analysis relies on balanced designs where each individual is measured at the same times (i.e., no missing observations). MANOVA, which imposes no constraints on the residual covariance matrix, is one common approach to analyzing longitudinal data. However, an unconstrained residual covariance structure is not efficient if the residual errors do possess a certain structure, especially since this structure is often of interest in longitudinal studies. Repeated measures ANOVA assumes sphericity. This assumption is too restrictive for longitudinal data because such data often exhibit larger correlations between nearby measurements than between measurements that are far apart. The variance and covariance of the within-subject errors also vary over time. The sphericity assumption is therefore inappropriate in longitudinal studies if the residual errors exhibit heterogeneity and dependence.

In longitudinal studies, the focus is on determining whether subjects respond differently under different treatment conditions or at different time points. The errors in longitudinal data often exhibit heterogeneity and dependence, which call for structured covariance models. Longitudinal data typically possess a hierarchical structure in which the repeated measurements are nested within an individual. While the repeated measures are the first level, the individual is the second-level unit and groups of individuals are higher-level units [2].
Traditional regression analysis and repeated measures ANOVA fail to deal with these two major characteristics of longitudinal data. Linear Mixed-Effects (LME) models are an alternative for analyzing longitudinal data. These models can be applied to data where the number and the spacing of occasions vary across individuals and the number of occasions is large. LME models can also be used for continuous time. LME models are more flexible than MANOVA in that they do not require an equal number of occasions for all individuals or even the same occasions. Moreover, varied covariance structures can be imposed on the residuals based on the nature of the data. Thus, LME models are well suited for longitudinal data that have variable occasion times, an unbalanced data structure, and a constrained covariance model for the residual errors. A systematic modeling approach using visual-graphical techniques and LME models was proposed and demonstrated using the original AASHO road test flexible pavement data [3]. The proposed approach, which includes characterizing the growth patterns at both group and individual levels, identifying the important predictors and unusual subjects, choosing suitable statistical models, selecting random-effects structures, suggesting possible residual covariance models, and examining the model fit, is further discussed [4–7].


29.2 Methods Hierarchical linear models allow researchers to analyze hierarchically nested data with two or more levels. A two-level hierarchical linear model consists of two submodels: an individual-level (level-1) model and a group-level (level-2) model. The parameters in the group-level model specify the unknown distribution of the individual-level parameters. The intercept and slopes at the individual level can be specified as random. Substituting the level-2 equations for the intercept and slopes into the level-1 model yields a linear mixed-effects (LME) model. LME models are mixed-effects models in which both fixed and random effects occur linearly in the model function [8].

In a typical hierarchical linear model, the individual is the level-1 unit in the hierarchy. In longitudinal studies, however, an individual has a series of measurements at different time points [9]. When modeling longitudinal data, the repeated measurements are therefore the level-1 units (i.e., a separate level below individuals). The individual is the second-level unit, and more levels can be added for possible group structures [2]. The basic model at the lowest level, also regarded as the repeated-measures level, for the application of hierarchical linear models to longitudinal data can be formulated as:

$$\text{Level-1:}\quad Y_{tj} = \beta_{0j} + \beta_{1j} c_{tj} + \beta_{2j} x_{tj} + r_{tj} \tag{29.1}$$

where $Y_{tj}$ is the measure for an individual $j$ at time $t$, $c_{tj}$ is the time variable indicating the time of measurement for this individual, $x_{tj}$ is the time-varying covariate, and $r_{tj}$ is the residual error term.

$$\text{Level-2:}\quad \begin{aligned} \beta_{0j} &= \gamma_{00} + \gamma_{01} W_{j1} + u_{0j} \\ \beta_{1j} &= \gamma_{10} + \gamma_{11} W_{j1} + u_{1j} \\ \beta_{2j} &= \gamma_{20} + \gamma_{21} W_{j1} + u_{2j} \end{aligned} \tag{29.2}$$

In this level-2 equation, $W$ is the time-invariant covariate for this individual. After substituting the level-2 equations into level-1, the combined or linear mixed-effects model is:

$$Y_{tj} = \left[\gamma_{00} + \gamma_{10} c_{tj} + \gamma_{20} x_{tj} + \gamma_{01} W_{j1} + \gamma_{11} W_{j1} c_{tj} + \gamma_{21} W_{j1} x_{tj}\right] + \left[u_{0j} + u_{1j} c_{tj} + u_{2j} x_{tj} + r_{tj}\right] \tag{29.3}$$

The level-1 model is a within-individuals model and the level-2 model is a between-individuals model [10]. Note that there is no time-invariant covariate in level-2 before introducing the variable $W$. In that case, the variances and covariances of the $u$'s are the variances and covariances of the random intercept and slopes. After introducing the variable $W$, the variances and covariances of the $u$'s are those of the residual intercept and slopes after partitioning out the variable $W$. More time-invariant variables can be added sequentially into level-2 to obtain different models. The reduction in the variance of the $u$'s provides an estimate of the variance in intercepts and slopes accounted for by those $W$'s [11]. This linear mixed-effects model does not require every individual to have the same number of observations, which may differ because of withdrawal from the study or data transmission errors.

Let $Y_{tj}$ denote the $t$th measurement on the $j$th individual, where $t = 1, 2, \ldots, n_j$ indexes the measurements for subject $j$, and $j = 1, 2, \ldots, N$ indexes the individuals. The vector $Y_j$ is the collection of the observations for the $j$th individual. A general linear mixed-effects model for individual $j$ in longitudinal analysis can be formulated as:

$$Y_j = X_j \beta + Z_j U_j + R_j \tag{29.4}$$

where $X_j$ is an $(n_j \times p)$ design matrix for the fixed effects, and $\beta$ is a $(p \times 1)$ vector of fixed-effect parameters. $Z_j$ is an $(n_j \times r)$ design matrix for the random effects, and $U_j$ is an $(r \times 1)$ vector of random-effect parameters assumed to be independently distributed across individuals with a normal distribution, $U_j \sim \mathrm{NID}(0, T)$. The $U_j$ vector captures the subject-specific mean effects and reflects the extra variability in the data. $R_j$ is an $(n_j \times 1)$ vector of residuals. The within errors, $R_j$, are assumed normally distributed with mean zero and variance $\sigma^2 W_j$, where $W_j$ (standing for "within") is a covariance matrix with a scale factor $\sigma^2$. The matrix $W_j$ can be parameterized using a few parameters and assumed to have various forms, e.g., an identity matrix or a first-order autoregressive or moving-average process [12, 13]. The within errors are independent from individual to individual and are independent of the random effects, $U_j$.

Other choices for variance–covariance structures that involve correlated within-subject errors have been proposed. Using appropriate covariance structures can increase efficiency and produce valid standard errors. The choice among covariance structures depends upon the data structure, subject-related theories, and available computer packages. In some cases, heterogeneous error variances can be employed in the model, so that the variances are allowed to increase or decrease with time. The assumption of a common variance shared by all individuals is then removed [12, 14]. LME models generally assume that level-1 residual errors are uncorrelated over time. This assumption is questionable for longitudinal data that have observations closely spaced in time, where dependence typically exists between adjacent observations. This is called serial correlation, and it tends to diminish as the time between observations increases.
Serial correlation is part of the error structure and, if it is present, it must be part of the model in order to produce a proper analysis [12]. If dependent within-subject errors are permitted, the choice of the model to represent the dependence needs careful consideration. It is preferable to incorporate as much individual-specific structure as possible before introducing a serial correlation structure into the within-subject errors [15].
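The substitution of the level-2 equations (29.2) into the level-1 equation (29.1) can be checked numerically. The following Python sketch simulates a few individuals with unequal numbers of occasions and confirms that the two-level form and the combined form (29.3) give the same response. All coefficient values, standard deviations, and covariate ranges are hypothetical and chosen only for illustration.

```python
import random


def level2(gammas, w, u):
    """Eq. 29.2: coefficients of individual j from covariate W and random effects u."""
    return [g0 + g1 * w + uk for (g0, g1), uk in zip(gammas, u)]


def level1(betas, c, x, r):
    """Eq. 29.1: response from individual coefficients, time c and covariate x."""
    b0, b1, b2 = betas
    return b0 + b1 * c + b2 * x + r


def combined(gammas, w, u, c, x, r):
    """Eq. 29.3: fixed part plus random part of the combined LME model."""
    (g00, g01), (g10, g11), (g20, g21) = gammas
    fixed = g00 + g10 * c + g20 * x + g01 * w + g11 * w * c + g21 * w * x
    rand = u[0] + u[1] * c + u[2] * x + r
    return fixed + rand


rng = random.Random(0)
# Hypothetical fixed effects: (gamma_k0, gamma_k1) for k = 0, 1, 2.
gammas = [(4.0, 0.3), (-0.05, 0.01), (0.2, -0.02)]

max_gap = 0.0
for j in range(5):
    w = rng.uniform(1.0, 6.0)                              # time-invariant covariate W_j1
    u = [rng.gauss(0.0, s) for s in (0.2, 0.01, 0.05)]     # random effects u_0j, u_1j, u_2j
    n_j = rng.randint(3, 8)                                # unequal numbers of occasions
    for t in range(n_j):
        c, x, r = float(t), rng.uniform(0.0, 1.0), rng.gauss(0.0, 0.4)
        two_level = level1(level2(gammas, w, u), c, x, r)
        one_equation = combined(gammas, w, u, c, x, r)
        max_gap = max(max_gap, abs(two_level - one_equation))

print(f"largest difference between the two forms: {max_gap:.2e}")
```

The difference is zero up to floating-point rounding, and the loop also shows that nothing in the model requires the same number of observations per individual.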

29.3 Data Description The AASHO road test was a large-scale highway research project conducted near Ottawa, Illinois from 1958 to 1960, and has had by far the largest impact on the history of pavement performance analysis. The test consisted of six loops, numbered 1–6. Each loop was a segment of a four-lane divided highway, and centerlines divided the pavements into inner and outer lanes, called lane 1 and lane 2. Pavement designs varied from section to section. All sections had been subjected to almost the same number of axle load applications on any given date. Performance data were collected based on the trend of the pavement serviceability index at 2-week intervals. The last day of each 2-week period was called an "index day." Index days were numbered sequentially from 1 (November 3, 1958) to 55 (November 30, 1960) [3, 7, 16].

Empirical relationships between pavement thickness, load magnitude, axle type, accumulated axle load applications, and performance trends for both flexible and rigid pavements were developed after the completion of the road test. Several combinations of certain rules, mathematical transformations, analyses of variance, graphs, and linear regression techniques were utilized in the modeling process to develop these empirical relationships. A load equivalence factor was then established to convert different configurations of load applications to standard 18-kip equivalent single-axle loads (ESAL). This ESAL concept has been adopted internationally since then. As pavement design evolves from traditional empirically based methods toward mechanistic-empirical methods, the ESAL concept used for traffic load estimation is no longer adopted in the recommended Mechanistic-Empirical Pavement Design Guide (MEPDG) [17], although many researchers have argued that it is urgently in need of reconsideration [3, 18, 19]. During the road test, it was found that the damage rate for flexible pavements was relatively low in winter but relatively high in spring. Therefore, load applications were adjusted by a "seasonal weighting function" so that a better "weighted" flexible pavement equation was developed.
Lee [18] has pointed out that the error variance increases when the predicted number of weighted load repetitions (W) increases. To predict the pavement serviceability index (PSI) after a given number of load applications on a given section, engineers commonly rearrange the original flexible pavement equation into the following form:

$$\text{PSI} = 4.2 - 2.7 \times 10^{\left[0.4 + \frac{1094}{(SN+1)^{5.19}}\right]\left[\log(\text{ESAL}) - 9.36\log(SN+1) + 0.2\right]}, \qquad SN = 0.44 D_1 + 0.14 D_2 + 0.11 D_3 \tag{29.5}$$

for which the regression statistics are: R² = 0.212, SEE = 0.622, N = 1083 [18]. Note that PSI ranges from 0 to 5 (0–1 for very poor; 1–2 for poor; 2–3 for fair; 3–4 for good; and 4–5 for very good conditions). D1 is the surface thickness (in.); D2 is the base thickness (in.); D3 is the subbase thickness (in.).
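As a worked illustration of Eq. 29.5, the Python sketch below evaluates the rearranged equation for one hypothetical section. The layer thicknesses (4, 6 and 8 in.) and the two ESAL levels are assumptions chosen only for the example, and the logarithm is taken as base 10. No clipping to the 0–5 PSI range is applied.

```python
import math


def structural_number(d1: float, d2: float, d3: float) -> float:
    """SN from surface (d1), base (d2) and subbase (d3) thickness in inches."""
    return 0.44 * d1 + 0.14 * d2 + 0.11 * d3


def psi(esal: float, sn: float) -> float:
    """Pavement serviceability index from Eq. 29.5 (base-10 logarithms)."""
    beta = 0.4 + 1094.0 / (sn + 1.0) ** 5.19
    log_term = math.log10(esal) - 9.36 * math.log10(sn + 1.0) + 0.2
    return 4.2 - 2.7 * 10.0 ** (beta * log_term)


# Hypothetical section: 4 in. surface, 6 in. base, 8 in. subbase.
sn = structural_number(4.0, 6.0, 8.0)
psi_early = psi(1e5, sn)   # after 100,000 ESALs
psi_late = psi(1e6, sn)    # after 1,000,000 ESALs

print(f"SN = {sn:.2f}, PSI(1e5) = {psi_early:.2f}, PSI(1e6) = {psi_late:.2f}")
```

For this section the predicted PSI drops as the accumulated load increases, which is the deterioration trend the equation is meant to capture.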

29.4 Exploratory Analysis Exploratory analysis is a technique to visualize the patterns of data. It is detective work of exposing data patterns relative to research interests. Exploratory analysis of longitudinal data can serve to: (a) discover as much of the information regarding raw data as possible rather than simply summarize the data; (b) highlight mean and individual growth patterns which are of potential research interest; and (c) identify longitudinal patterns and unusual subjects. Hence, individual curves should be plotted and carefully examined before any formal curve fitting is carried out. Given the nature of this flexible pavement data, the exploratory analysis includes exploring "growth" patterns and the patterns regarding experimental conditions.

Fig. 29.1 Mean PSI for each subject (loop/lane) versus index day (x-axis: Index Day; y-axis: Mean PSI; one line per subject)

29.4.1 Exploring "Growth" Patterns The first step, which is perhaps the best way to get a sense of a new data set, is to visualize or plot the data. Most longitudinal data analyses address individual growth patterns over time. Thus, the first useful exploratory analysis is to plot the response variable against time, including individual and overall mean profiles. Individual mean profiles, which summarize aspects of the response variable for each individual over time, can be used to examine the possibility of variations among individuals and to identify potential outliers. The overall mean profile summarizes some aspects of the response variable over time for all subjects and is helpful in identifying unusual times when significant differences arise. Figure 29.1 shows the lines connecting the dependent variable (mean PSI) over time for each subject (loop/lane). Most subjects have higher mean PSIs at the beginning of the observation period, and they tend to decrease over time. The spread among the subjects is substantially smaller at the beginning than at the end. In addition, there exist noticeable variations among subjects. The overall mean growth curve over time indicates that the overall mean PSIs are larger at the beginning and decrease over time, and that the rate of deterioration is higher at the beginning than at the end.
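The individual and overall mean profiles described above can be computed as in the following Python sketch. The records are hypothetical (two subjects, two index days, two sections each) and are not taken from the AASHO data; only the grouping logic is the point.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (subject, index day, PSI of one section).
records = [
    ("loop3/lane1", 1, 4.1), ("loop3/lane1", 1, 3.9),
    ("loop3/lane1", 2, 3.6), ("loop3/lane1", 2, 3.4),
    ("loop4/lane2", 1, 4.2), ("loop4/lane2", 1, 4.0),
    ("loop4/lane2", 2, 3.9), ("loop4/lane2", 2, 3.7),
]


def mean_profiles(rows):
    """Return (individual mean profiles, overall mean profile) by index day."""
    by_subject = defaultdict(lambda: defaultdict(list))
    by_day = defaultdict(list)
    for subject, day, value in rows:
        by_subject[subject][day].append(value)
        by_day[day].append(value)
    individual = {
        subject: {day: mean(vals) for day, vals in sorted(days.items())}
        for subject, days in by_subject.items()
    }
    overall = {day: mean(vals) for day, vals in sorted(by_day.items())}
    return individual, overall


individual, overall = mean_profiles(records)

# Spread among subjects on each index day (max minus min of subject means).
spread = {
    day: max(p[day] for p in individual.values()) - min(p[day] for p in individual.values())
    for day in overall
}

print(individual)
print(overall)
print(spread)
```

Plotting each entry of `individual` against the index day gives the kind of display shown in Fig. 29.1, and `spread` quantifies how far apart the subject profiles are on each day.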


29.4.2 Exploring the Patterns of Experimental Conditions In addition to time (in terms of index day), different major experimental conditions may be considered. This exploratory analysis is intended to discover the overall and individual patterns of each experimental condition and their interactions on mean PSIs. Subsequently, the patterns of mean PSIs for each subject and the patterns of overall mean PSIs for each experimental condition and their interactions over time are investigated [7]. Generally speaking, the mean PSIs for pavements with thicker surface layers are higher than those for pavements with thinner surface layers.

29.5 Linear-Mixed Effects Modeling Approach The following proposed modeling approach is generally applicable to modeling multilevel longitudinal data with a large number of time points. Model building procedures including the selection of a preliminary mean structure, the selection of a random structure, the selection of a residual covariance structure, model reduction, and the examination of the model fit are subsequently illustrated.

29.5.1 Selecting a Preliminary Mean Structure Covariance structures are used to model variation that cannot be explained by fixed effects, and they depend highly on the mean structure. The first step in model building is therefore to identify the systematic part and remove it so that the remaining variation can be examined. The dataset includes the following explanatory variables: thick, basethk, subasthk, uwtappl, and FT, where thick is the surface thickness (in.); basethk is the base thickness (in.); subasthk is the subbase thickness (in.); uwtappl is the unweighted applications (millions); and FT is the monthly number of freeze–thaw cycles. A model containing all main effects and all the two-way and three-way interaction terms was investigated first. This model (called model-1) has the form:

$$\begin{aligned} \text{PSI}_{ij} ={}& \beta_{0j} + \beta_{1j}(\text{thick})_{ij} + \beta_{2j}(\text{basethk})_{ij} + \beta_{3j}(\text{subasthk})_{ij} + \beta_{4j}(\text{uwtappl})_{ij} + \beta_{5j}(\text{uwtappl})^2_{ij} + \beta_{6j}(\text{FT})_{ij} \\ &+ \text{two-way interaction terms of thick, basethk, subasthk, and uwtappl} \\ &+ \text{three-way interaction terms of thick, basethk, subasthk, and uwtappl} + R_{ij} \end{aligned} \tag{29.6}$$
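To make the size of model-1 concrete, the Python sketch below enumerates its terms and builds one row of the corresponding design matrix. The variable names follow the text; the term labels (joined with `*`) and the sample observation are illustrative assumptions.

```python
import math
from itertools import combinations

# Variables that enter the interaction terms of model-1 (Eq. 29.6).
factors = ["thick", "basethk", "subasthk", "uwtappl"]


def design_row(obs):
    """One design-matrix row: intercept, main effects, two- and three-way products."""
    row = {"(Intercept)": 1.0}
    for name in factors:
        row[name] = obs[name]
    row["uwtappl^2"] = obs["uwtappl"] ** 2
    row["FT"] = obs["FT"]
    for size in (2, 3):
        for combo in combinations(factors, size):
            row["*".join(combo)] = math.prod(obs[name] for name in combo)
    return row


# Hypothetical observation: thicknesses in inches, applications in millions.
obs = {"thick": 4.0, "basethk": 6.0, "subasthk": 8.0, "uwtappl": 0.5, "FT": 3.0}
row = design_row(obs)

print(len(row), "columns")
for name, value in row.items():
    print(f"{name:28s} {value:g}")
```

With four interacting variables there are six two-way and four three-way products, so model-1 has 17 columns in total, including the intercept.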


29.5.2 Selecting a Preliminary Random Structure The second step is to select a set of random effects in the covariance structure. An appropriately specified covariance structure is helpful in interpreting the random variation in the data, achieving the efficiency of estimation, as well as obtaining valid inferences of the parameters in the mean structure of the model. In longitudinal studies, the same subject is repeatedly measured over time. The data collected from longitudinal study is a collection of correlated data. The within-subject errors are often heteroscedastic (i.e., having unequal variance), correlated, or both.

29.5.2.1 Exploring Preliminary Random-Effects Structure A useful tool to explore the random-effects structure is to remove the mean structure from the data and use ordinary least squares (OLS) residuals to check the need for a linear mixed-effects model and to decide which time-varying covariates should be included in the random structure. A boxplot of residuals by subject was produced from the fit of a single linear regression with the same form as the preliminary level-1 model. This corresponds to ignoring the grouping structure in the hierarchy of the data. The residuals are not centered around zero, and there are considerable differences in the magnitudes of the residuals among subjects. This indicates the need for subject effects, which is precisely the motivation for using a linear mixed-effects model. Separate linear regression models were then fitted to each subject to explore the potential linear relationship. To assist in selecting a set of random effects to be included in the covariance model, plots of the mean raw residuals against time and of the variance of the residuals against time are useful. If only a random-intercepts model holds, the residual has the form $e_{ij} = U_{0j} + R_{ij}$, in which $U_{0j}$ is the random effect for the intercept and $R_{ij}$ is the level-1 error. If the plot shows constant variability over time or the curves are flat, then only a random-intercept model is needed. If a random-intercepts-and-slopes model holds, the residual has the form $e_{ij} = U_{0j} + U_{1j} x_{1ij} + \cdots + U_{qj} x_{qij} + R_{ij}$, where $U_{qj}$ is the random effect for the $q$th slope. In this case, the plot shows variability that changes over time or some unexplained systematic structure in the model. One or more random effects, in addition to the random intercept, then have to be added.
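The residual check described above can be sketched in a few lines of Python. A single pooled regression line is fitted while ignoring the grouping, and the residuals are then summarized by subject. The data are hypothetical (two subjects, three occasions) and a simple regression on time stands in for the preliminary level-1 model.

```python
from collections import defaultdict
from statistics import mean


def ols_line(xs, ys):
    """Intercept and slope of a simple least-squares line."""
    xbar, ybar = mean(xs), mean(ys)
    sxx = sum((x - xbar) ** 2 for x in xs)
    slope = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / sxx
    return ybar - slope * xbar, slope


# Hypothetical data: (subject, time, response).
data = [
    ("A", 0, 4.4), ("A", 1, 4.2), ("A", 2, 4.0),
    ("B", 0, 3.8), ("B", 1, 3.4), ("B", 2, 3.0),
]

# Pooled fit that ignores the grouping structure.
intercept, slope = ols_line([t for _, t, _ in data], [y for _, _, y in data])

residuals = defaultdict(list)
for subject, t, y in data:
    residuals[subject].append(y - (intercept + slope * t))

mean_residual = {s: mean(r) for s, r in residuals.items()}

print(f"pooled line: {intercept:.2f} + ({slope:.2f}) * time")
print(mean_residual)
```

Subject means that sit away from zero suggest a random intercept. A trend in the residuals within a subject (here they grow over time for A and shrink for B) suggests that a random slope is needed as well.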

29.5.2.2 Selecting a Variance–Covariance Structure for Random Effects Three possible variance–covariance structures, namely general positive-definite (unstructured), diagonal, and block-diagonal, based on different assumptions [8], were investigated. General positive-definite is a general covariance matrix parameterized directly in terms of variances and covariances. The diagonal covariance structure is used when the random effects are assumed independent. The block-diagonal matrix is employed when it is assumed that different sets of random effects have different variances. Table 29.1 displays the comparison of these three models. The unstructured model has the smallest absolute value of log-likelihood among them. The likelihood ratio statistic for the unstructured model versus the diagonal model is 160.23 with a p-value of less than 0.0001. Thus, the unstructured variance–covariance model is used hereafter.

Table 29.1 Model comparison using three variance–covariance structures

Model          df    AIC         BIC         logLik      Test      L. ratio   p-Value
(1) Unstr      29    12910.29    13117.74    -6426.14
(2) Diag       22    13056.52    13213.90    -6506.26    1 vs 2    160.234    <0.0001
(3) Bk-diag    21    13060.14    13210.37    -6509.07    2 vs 3    5.621      0.0177

The random effects of the preliminary level-2 model include the intercept, uwtappl, the quadratic term of uwtappl, and FT. The variance–covariance structure is a general positive-definite matrix. Putting the preliminary level-1 and level-2 models together, the preliminary linear mixed-effects model is then:

$$\begin{aligned} \text{PSI}_{ij} ={}& \gamma_{00} + \gamma_{10}(\text{thick})_{ij} + \gamma_{20}(\text{basethk})_{ij} + \gamma_{30}(\text{subasthk})_{ij} + \gamma_{40}(\text{uwtappl})_{ij} + \gamma_{50}(\text{uwtappl})^2_{ij} + \gamma_{60}(\text{FT})_{ij} \\ &+ \gamma_{70}(\text{thick} \times \text{basethk})_{ij} + \gamma_{80}(\text{thick} \times \text{subasthk})_{ij} + \gamma_{90}(\text{basethk} \times \text{uwtappl})_{ij} + \gamma_{10,0}(\text{subasthk} \times \text{uwtappl})_{ij} \\ &+ \gamma_{11,0}(\text{basethk} \times \text{subasthk} \times \text{uwtappl})_{ij} + \gamma_{12,0}(\text{thick} \times \text{basethk} \times \text{subasthk} \times \text{uwtappl})_{ij} \\ &+ U_{0j} + U_{4j}(\text{uwtappl})_{ij} + U_{5j}(\text{uwtappl})^2_{ij} + U_{6j}(\text{FT})_{ij} + R_{ij} \end{aligned} \tag{29.7}$$
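The AIC values and likelihood ratio tests reported in Table 29.1 can be reproduced from the degrees of freedom and log-likelihoods alone. The Python sketch below is one way to do so, assuming the usual definitions AIC = -2 logLik + 2 df and a chi-squared reference distribution with degrees of freedom equal to the difference in parameter counts. The helper `chi2_sf` is an illustrative closed-form implementation for integer degrees of freedom, not a library function.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability of a chi-squared variable with integer df >= 1."""
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{r < df/2} (x/2)^r / r!
        term, total = 1.0, 1.0
        for r in range(1, df // 2):
            term *= (x / 2.0) / r
            total += term
        return math.exp(-x / 2.0) * total
    # Odd df: normal tail plus a finite series in odd powers of sqrt(x).
    root = math.sqrt(x)
    total, term = 0.0, root
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    density = math.exp(-x / 2.0) / math.sqrt(2.0 * math.pi)
    return math.erfc(root / math.sqrt(2.0)) + 2.0 * density * total


# (df, logLik) from Table 29.1.
models = {"Unstr": (29, -6426.14), "Diag": (22, -6506.26), "Bk-diag": (21, -6509.07)}

aic = {name: -2.0 * ll + 2.0 * df for name, (df, ll) in models.items()}


def lr_test(big: str, small: str):
    """Likelihood ratio statistic and p-value for nested models."""
    (df_big, ll_big), (df_small, ll_small) = models[big], models[small]
    stat = 2.0 * (ll_big - ll_small)
    return stat, chi2_sf(stat, df_big - df_small)


lr_12, p_12 = lr_test("Unstr", "Diag")
lr_23, p_23 = lr_test("Diag", "Bk-diag")

print({k: round(v, 2) for k, v in aic.items()})
print(f"1 vs 2: L. ratio = {lr_12:.2f}, p = {p_12:.3g}")
print(f"2 vs 3: L. ratio = {lr_23:.2f}, p = {p_23:.4f}")
```

The computed statistics agree with the tabulated values to within the rounding of the log-likelihoods.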

29.5.3 Selecting a Residual Covariance Structure The absolute value of the log-likelihood for the heteroscedastic model is 6273.29. The need for a heteroscedastic model can be formally checked by using the likelihood ratio test [7]. The small p-value indicates that the heteroscedastic model explains the data significantly better than the homoscedastic model. Correlation structures are used to model dependence among the within-subject errors. The autoregressive model of order 1, called AR(1), is the simplest and one of the most useful models [8]. The autocorrelation function (ACF) for AR(1) is particularly simple: it starts at lag 1 and then declines geometrically. Autocorrelation functions for autoregressive models of order greater than one are typically oscillating or sinusoidal functions and tend to damp out with increasing lag [20].


Thus, AR(1) may be adequate to model the dependency of the within-subject errors. The absolute value of the log-likelihood for this heteroscedastic AR(1) model is 6207.24. The estimated single correlation parameter $\phi$ is 0.125. The heteroscedastic model (corresponding to $\phi = 0$) is nested within the heteroscedastic AR(1) model. Likewise, the need for the heteroscedastic AR(1) model can be checked using the likelihood ratio test [7]. The small p-value indicates that the heteroscedastic AR(1) model explains the data significantly better than the heteroscedastic model, suggesting that within-group serial correlation is present in the data.
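The geometric decay of the AR(1) autocorrelation and the likelihood ratio comparison described above can be sketched in Python as follows. The correlation parameter and the two log-likelihoods are those quoted in the text; the one-degree-of-freedom chi-squared reference (one extra parameter, $\phi$) is an assumption of this sketch, since the detailed test results are given in [7].

```python
import math


def ar1_acf(phi: float, max_lag: int):
    """Autocorrelation of an AR(1) process at lags 0..max_lag: phi ** lag."""
    return [phi ** lag for lag in range(max_lag + 1)]


def ar1_corr_matrix(phi: float, n: int):
    """Within-subject correlation matrix for n equally spaced occasions."""
    return [[phi ** abs(i - j) for j in range(n)] for i in range(n)]


phi = 0.125
acf = ar1_acf(phi, 3)
corr = ar1_corr_matrix(phi, 4)

# Likelihood ratio test: heteroscedastic model (phi = 0) vs heteroscedastic AR(1).
loglik_hetero, loglik_ar1 = -6273.29, -6207.24
lr_stat = 2.0 * (loglik_ar1 - loglik_hetero)
p_value = math.erfc(math.sqrt(lr_stat / 2.0))  # chi-squared tail with 1 df

print("ACF:", acf)
for line in corr:
    print(["%.4f" % v for v in line])
print(f"L. ratio = {lr_stat:.2f}, p = {p_value:.3g}")
```

With $\phi$ = 0.125 the correlation is already below 0.02 at lag 2, so the dependence is short-ranged even though the test shows it is clearly present.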

29.5.4 Model Reduction After specifying the within-subject error structure, the next step is to check whether the random effects can be simplified. It is also desirable to reduce the number of parameters in the fixed effects in order to achieve a parsimonious model that represents the data well. A likelihood ratio test statistic, whose sampling distribution is a mixture of two chi-squared distributions, is used to test the need for random effects. The p-value is the equally weighted average of the p-values from the two chi-squared distributions. To assess the significance of the terms in the fixed effects, conditional t-tests are used.
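The equally weighted mixture p-value can be sketched in Python as follows. The text does not state the degrees of freedom of the two chi-squared components; this sketch assumes the common case of testing q against q + 1 random effects, for which the components have q and q + 1 degrees of freedom, and it covers only q = 0 and q = 1 so that closed-form tails can be used. The test statistic of 3.0 is a hypothetical value.

```python
import math


def chi2_sf_small(x: float, df: int) -> float:
    """Chi-squared upper tail for df = 0, 1 or 2 (df = 0 is a point mass at zero)."""
    if df == 0:
        return 1.0 if x <= 0 else 0.0
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    raise ValueError("only df = 0, 1, 2 are covered in this sketch")


def mixture_p_value(lr_stat: float, q: int) -> float:
    """Equal-weight mixture of chi-squared(q) and chi-squared(q + 1) tails."""
    return 0.5 * chi2_sf_small(lr_stat, q) + 0.5 * chi2_sf_small(lr_stat, q + 1)


lr_stat = 3.0                       # hypothetical likelihood ratio statistic
p_mix = mixture_p_value(lr_stat, q=1)
p_df1 = chi2_sf_small(lr_stat, 1)
p_df2 = chi2_sf_small(lr_stat, 2)

print(f"mixture p = {p_mix:.4f} (chi2_1: {p_df1:.4f}, chi2_2: {p_df2:.4f})")
```

The mixture p-value always lies between the two component p-values, so using the larger degrees of freedom alone would make the test for an extra random effect too conservative.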

29.5.4.1 Reduction of Random Effects The matrix of known covariates should not contain a polynomial effect unless all hierarchically inferior terms are also included [21]. The same rule applies to interaction terms. Hence, significance tests for higher-order random effects should be performed first. The random effects included in the preliminary random-effects structure are: intercept, uwtappl, uwtappl2, and FT. The models and the associated maximum log-likelihood values are compared [7]. The small p-value indicates that the preliminary random-effects structure explains the data significantly better than the others. Thus, no reduction of random effects is needed.

29.5.4.2 Reduction of Fixed Effects An adequate and appropriately specified random-effects structure implies efficient model-based inferences for the fixed effects. When considering the reduction of fixed effects, one model is nested within the other and the random-effects structures are the same for the full and the reduced models. Likelihood ratio tests are appropriate for the model comparison. The parameter estimates, estimated standard errors, t-statistics and p-values for the fixed effects of the heteroscedastic AR(1) model are revisited. The heteroscedastic AR(1) model can be reduced to a more parsimonious model due to the existence of some insignificant parameter estimates. The reduction of fixed effects starts with removing the parameters with the largest p-values, removing insignificant terms, and combining the parameters that do not change significantly. These processes are repeated until no important terms have been left out of the model.

Table 29.2 Proposed preliminary LME model

Random effects        Intercept   uwtappl   uwtappl2   FT        Residual
Standard deviation    0.170       1.679     0.765      0.00722   0.448

Fixed effects
Parameter                         Value      Std. error   DF     t-Value   p-Value
(Intercept)                       2.4969     0.0703       9423   35.51     <0.0001
thick                             0.2629     0.0122       9423   21.48     <0.0001
basethk                           0.0590     0.0066       9423   8.91      <0.0001
subasthk                          0.0386     0.0041       9423   9.37      <0.0001
uwtappl                           -3.6191    0.5254       9423   -6.89     <0.0001
uwtappl2                          1.1524     0.2481       9423   4.65      <0.0001
FT                                0.0148     0.0023       9423   6.39      <0.0001
thick*basethk                     -0.0062    0.0016       9423   -3.81     <0.0001
thick*subasthk                    -0.0082    0.0010       9423   -8.07     <0.0001
basethk*uwtappl                   0.1275     0.0172       9423   7.40      <0.0001
subasthk*uwtappl                  0.1355     0.0181       9423   7.50      <0.0001
thick*basethk*uwtappl             -0.0155    0.0045       9423   -3.43     0.0006
thick*subasthk*uwtappl            -0.0077    0.0036       9423   -2.16     0.0307
basethk*subasthk*uwtappl          -0.0291    0.0029       9423   -9.87     <0.0001
thick*basethk*subasthk*uwtappl    0.0073     0.0006       9423   11.53     <0.0001

Note Model fit: AIC = 12481.77, BIC = 12710.69, logLik = -6208.89. Correlation structure: AR(1); parameter estimate(s): Phi = 0.126. Variance function structure: for different standard deviations per stratum (thick = 2, 1, 3, 4, 5, 6 in.), the parameter estimates are: 1, 1.479, 0.935, 1.199, 0.982, 0.959
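The t-statistics used to judge the fixed effects are simply the estimates divided by their standard errors. The Python sketch below recomputes a few of them from the rounded values in Table 29.2. Because the residual degrees of freedom are large (9423), the two-sided p-value is approximated here with the standard normal distribution; this approximation is an assumption of the sketch and not necessarily how the tabulated p-values were obtained.

```python
import math

# (estimate, standard error) for selected rows of Table 29.2.
fixed_effects = {
    "(Intercept)": (2.4969, 0.0703),
    "uwtappl": (-3.6191, 0.5254),
    "uwtappl2": (1.1524, 0.2481),
    "thick*subasthk*uwtappl": (-0.0077, 0.0036),
}


def t_and_p(estimate: float, std_error: float):
    """t-statistic and two-sided p-value (normal approximation for large DF)."""
    t = estimate / std_error
    p = math.erfc(abs(t) / math.sqrt(2.0))
    return t, p


results = {name: t_and_p(est, se) for name, (est, se) in fixed_effects.items()}

for name, (t, p) in results.items():
    print(f"{name:26s} t = {t:7.2f}   p = {p:.4g}")
```

Small differences from the tabulated t-values arise because the estimates and standard errors in the table are rounded to four decimals.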

29.5.5 Proposed Preliminary LME Model The final proposed preliminary linear mixed-effects model is listed in Table 29.2. The fixed-effects structure of the proposed model contains significant treatment effects for thick, basethk, subasthk, uwtappl, uwtappl2, FT, and several two-, three-, and four-way interaction terms. The positive parameter estimates for thick, basethk, and subasthk indicate that higher mean PSI values tend to occur on thicker pavements. The parameter estimate of uwtappl is negative, indicating lower PSI values at higher load applications. Furthermore, the preliminary LME model also indicates that the standard error for the pavements with a surface thickness of 1 in. or 4 in. is about 48 or 20% higher, respectively, than that for pavements with a surface thickness of 2 in. There exists dependency in the within-subject errors. The estimated single correlation parameter for the AR(1) model is 0.126.

29.5.6 Examination of the Model Fit

The model fit can be examined with a plot, by subject, of the population predictions (fixed), the within-group predictions (Subject), and the observed values versus time for the proposed preliminary LME model. Population predictions are obtained by setting the random effects to zero, whereas within-group predictions use the estimated random effects [7]. The prediction line of the within-group predictions follows the observed values more closely, indicating that the proposed LME model provides a better explanation of the data.

29.6 Conclusions

This paper proposed a systematic modeling approach using visual-graphical techniques and LME models that is generally applicable to modeling multilevel longitudinal data with a large number of time points. The original AASHO road test flexible pavement data was used to illustrate the proposed modeling approach. Exploratory analysis of the data indicated that most subjects (loop/lane) have higher mean PSIs at the beginning of the observation period, and that these tend to decrease over time. The spread among the subjects is substantially smaller at the beginning than at the end. In addition, there are noticeable variations among subjects.

A preliminary LME model for PSI prediction was developed. The positive parameter estimates for thick, basethk, and subasthk indicate that higher mean PSI values tend to occur on thicker pavements. The parameter estimate of uwtappl is negative, indicating lower PSI values for higher load applications. The prediction line of the within-group predictions (Subject) follows the observed values more closely than that of the population predictions (fixed), indicating that the proposed LME model provides a better explanation of the data.

References

1. Goldstein H (1979) The design and analysis of longitudinal studies. Academic Press, Inc, New York
2. Hox JJ (2000) Multilevel analysis of grouped and longitudinal data. In: Little TD, Schnabel KU, Baumert J (eds) Modeling longitudinal and multilevel data: practical issues, applied approaches and specific examples. Lawrence Erlbaum Associates, Mahwah, pp 15–32
3. Highway Research Board (1962) The AASHO road test, report 5, pavement research, special report 61E. National Research Council, Washington
4. Ker HW (2002) Application of regression spline in multilevel longitudinal modeling. Doctoral dissertation, University of Illinois, Urbana
5. Lee YH, Ker HW (2008) Reevaluation and application of the AASHTO mechanistic-empirical pavement design guide, phase I, summary report, NSC96-2211-E-032-036. National Science Council, Taipei City (In Chinese)
6. Lee YH, Ker HW (2009) Reevaluation and application of the AASHTO mechanistic-empirical pavement design guide, phase II, NSC97-2221-E-032-034, summary report. National Science Council, Taipei City (In Chinese)
7. Ker HW, Lee YH (2010) Preliminary analysis of AASHO road test flexible pavement data using linear mixed effects models. In: Lecture notes in engineering and computer science: proceedings of the world congress on engineering 2010, WCE 2010, 30 June–2 July, London, UK, pp 260–266
8. Pinheiro JC, Bates DM (2000) Mixed-effects models in S and S-PLUS. Springer, New York
9. Laird NM, Ware JH (1982) Random effects models for longitudinal data. Biometrics 38:963–974
10. Anderson CJ (2001) Model building. http://www.ed.uiuc.edu/courses/edpsy490ck
11. MacCallum RC, Kim C (2000) Modeling multivariate change. In: Little TD, Schnabel KU, Baumert J (eds) Modeling longitudinal and multilevel data: practical issues, applied approaches and specific examples. Lawrence Erlbaum Associates, NJ, pp 51–68
12. Jones RH (1993) Longitudinal data with serial correlation: a state-space approach. Chapman & Hall, London
13. Vonesh EF, Chinchilli VM (1997) Linear and nonlinear models for the analysis of repeated measurements. Marcel Dekker, Inc, New York
14. Carlin BP, Louis TA (1996) Bayes and empirical Bayes methods for data analysis. Chapman & Hall, London
15. Goldstein H, Healy MJR, Rasbash J (1994) Multilevel time series models with application to repeated measures data. Stat Med 13:1643–1655
16. Huang YH (2004) Pavement analysis and design, 2nd edn. Pearson Education, Inc., Upper Saddle River
17. ARA, Inc (2004) ERES consultants division, guide for mechanistic-empirical design of new and rehabilitated pavement structure. NCHRP 1–37A report. Transportation Research Board, National Research Council, Washington
18. Lee YH (1993) Development of pavement prediction models. Doctoral dissertation, University of Illinois, Urbana
19. Ker HW, Lee YH, Wu PH (2008) Development of fatigue cracking performance prediction models for flexible pavements using LTPP database. J Transp Eng ASCE 134(11):477–482
20. Pindyck RS, Rubinfeld DL (1998) Econometric models and economic forecasts, 4th edn. McGraw-Hill, Inc, New York
21. Morrell CH, Pearson JD, Brant LJ (1997) Linear transformations of linear mixed-effects models. Am Stat 51:338–343

Chapter 30

Chi-Squared, Yule’s Q and Likelihood Ratios in Tabular Audiology Data

Muhammad Naveed Anwar, Michael P. Oakes and Ken McGarry

Abstract In this chapter, we use the chi-squared test and Yule’s Q measure to discover associations in tables of patient audiology data. These records are examples of heterogeneous medical records, since they contain audiograms, textual notes and typical relational fields. In our first experiment we used the chi-squared measure to discover associations between the different fields of audiology data, such as patient gender and patient age, and the diagnosis and type of hearing aid worn. Then, in our second experiment, we used Yule’s Q to discover the strength and direction of the significant associations found by the chi-squared measure. Finally, we examined the likelihood ratio used in Bayesian evidence evaluation. We discuss our findings in the context of producing an audiology decision support system.

M. N. Anwar (corresponding author), M. P. Oakes
Department of Computing, Engineering & Technology, University of Sunderland, Sunderland, UK

K. McGarry
Department of Pharmacy, Health and Well-Being, University of Sunderland, Sunderland, UK

S. I. Ao and L. Gelman (eds.), Electrical Engineering and Applied Computing, Lecture Notes in Electrical Engineering, 90, DOI: 10.1007/978-94-007-1192-1_30, © Springer Science+Business Media B.V. 2011


30.1 Introduction

Association measures can be used to measure the strength of the relationship between variables in medical data. Discovering associations in medical data plays an important role in predicting a patient’s risk of certain diseases. Early detection of any disease can save time, money and painful procedures [1]. In our work we are looking for significant associations in heterogeneous audiology data, with the ultimate aim of finding the factors that influence which patients would benefit most from being fitted with a hearing aid.

Support and confidence are measures of the interestingness of associations between variables [2, 3]. They show the usefulness and certainty of discovered associations. Strong associations are not always interesting, because support and confidence do not filter out uninteresting associations [4]. To overcome this problem, support and confidence are augmented with a correlation measure. One of the correlation measures popularly used in the medical domain is chi-squared (χ²).

In Sect. 30.2 we describe our database of audiology data. We first use the chi-squared measure to discover significant associations in our data, as described in Sect. 30.3. We then use Yule’s Q measure to discover the strength of each of our significant associations, as described in Sect. 30.4. In Sect. 30.5, we briefly describe our findings for the support and confidence of each of the significant associations. In Sect. 30.6, we use Bayesian likelihood ratios to find associations between words in the comments fields and the type of hearing aid fitted. We draw our conclusions in Sect. 30.7.

30.2 Audiology Data

In this study, we have made use of audiology data collected at the hearing aid outpatient clinic at James Cook University Hospital in Middlesbrough, England, UK. The data consists of about 180,000 individual records covering about 23,000 audiology patients. The data in the records is heterogeneous, consisting of the following fields:

1. Audiograms, which are graphs of hearing ability at different frequencies (pitches).
2. Structured data: gender, date of birth, diagnosis and hearing aid type, as stored in a typical database, e.g. |M|, |09-05-1958|, |TINNITUS|, |BE18|.
3. Textual notes: specific observations made about each patient, such as |HEARING TODAY NEAR NORMAL—USE AID ONLY IF NECESSARY|.

In general, these audiology records are representative of all types of medical records because they involve both structured and unstructured data.


30.3 Discovery of Associations with the Chi-Squared Test Tables

The chi-squared test is a simple way to provide estimates of quantities of interest and related confidence intervals [5]. It is a measure of association between variables (such as the fields of the tables in a relational database) where the variables are nominal and related to each other [6]. The chi-squared test is popular in the medical domain because of its simplicity. It has been used in pharmacology to classify text according to subtopics [7]. The resulting chi-squared value is a measure of the differences between a set of observed and expected frequencies within a population, and is given by the formula [5]:

$$\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$$

where r is the number of unique terms in a particular field of the patient records, such as diagnosis or hearing aid type, corresponding to the rows in Table 30.1, and c is the number of categories in the data (such as age or gender), corresponding to the columns in Table 30.1. Table 30.1 is produced for two diagnoses occurring in the hearing diagnosis field. For example, if 535 of the hearing diagnosis fields of the records of patients aged ≤ 54 years contained the diagnosis ‘tinnitus’, we would record a value of 535 for that term being associated with that category. These values were the ‘observed’ values, denoted O_ij in the formula above. The corresponding ‘expected’ values E_ij were found by the formula:

E_ij = (Row total × Column total) / Grand total

The row total for the ‘tinnitus’ diagnosis is the total number of times the ‘tinnitus’ diagnosis was assigned to patients in both age categories = 535 + 592 = 1127. The column total for ‘Age ≤ 54’ is the total number of patients in that age group over both diagnoses = 702. The grand total is the total number of patient records in the study = 1364. Thus the ‘expected’ number of patients diagnosed with ‘tinnitus’ in the ‘Age ≤ 54’ group was 1127 × 702/1364 = 580.02. The significance of this is that the expected value is greater than the observed value, suggesting that there is a negative degree of association between the ‘tinnitus’

Table 30.1 Observed and expected frequencies for diagnosis

Diagnosis      Age ≤ 54                  Age > 54                  Row total
Not-tinnitus   167 (121.98) [2027.24]    70 (115.02) [2027.24]     237
Tinnitus       535 (580.02) [2027.24]    592 (546.98) [2027.24]    1127
Column total   702                       662                       1364

Expected frequencies are in ( ); (observed frequency − expected frequency)² values are in [ ]


diagnosis and the category ‘Age ≤ 54’. The remainder of the test is then performed to discover whether this association is statistically significant.

Since we were in effect performing many individual statistical tests, it was necessary to use the Bonferroni correction [5] to control the rate of Type I errors, where a pair of variables spuriously appear to be associated. For example, for us to be 99.9% confident that a particular keyword was typical of a particular category, the corresponding significance level of 0.001 had to be divided by the number of simultaneous tests, i.e. the number of unique words times the number of categories. In the case of words in the text fields, this gave a corrected significance level of 0.001/(2 × 2) = 0.00025. Using West’s chi-squared calculator [8], for significance at the overall 0.001 level (a corrected level of 0.00025) with one degree of freedom, we obtained a chi-squared threshold of 13.41. Thus each word associated with a category with a chi-squared value of more than 13.41 was taken to be significantly associated with that category at the 0.001 level.

The overall chi-squared values for the relationship between the test variables age and gender and hearing aid type (behind the ear, BTE, or in the ear, ITE) are shown in Table 30.2. The overall chi-squared value for the relationship between the words in the comments text and hearing aid type was calculated by summing the chi-squared values of all possible pairs of text word and BTE/ITE right aid, and is also shown in Table 30.2. This data shows, with 99.9% confidence, that these text words were not randomly distributed, and that some text words are probably associated with hearing aid type. Similarly, the associations of each of the variables (age, comments text, gender and hearing aid type) with tinnitus diagnosis are shown in Table 30.3. Here we see that there are significant associations between age, comments text, and BTE/ITE right aid and a diagnosis of tinnitus, but there is no significant association between gender and tinnitus diagnosis.
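The calculation described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' code: the function names are hypothetical, the counts are the observed values from Table 30.1, and the threshold is obtained from the standard normal quantile (for one degree of freedom a chi-squared variable is a squared standard normal) rather than from West's calculator.

```python
from statistics import NormalDist

# Observed counts from Table 30.1 (rows: diagnosis, columns: age group).
observed = {
    "not-tinnitus": {"age<=54": 167, "age>54": 70},
    "tinnitus": {"age<=54": 535, "age>54": 592},
}


def expected_counts(obs):
    """Expected count per cell = row total * column total / grand total."""
    row_totals = {r: sum(cols.values()) for r, cols in obs.items()}
    col_totals = {}
    for cols in obs.values():
        for c, n in cols.items():
            col_totals[c] = col_totals.get(c, 0) + n
    grand_total = sum(row_totals.values())
    return {
        r: {c: row_totals[r] * col_totals[c] / grand_total for c in cols}
        for r, cols in obs.items()
    }


def chi_squared(obs):
    """Sum of (O - E)^2 / E over all cells of the table."""
    exp = expected_counts(obs)
    return sum(
        (obs[r][c] - exp[r][c]) ** 2 / exp[r][c] for r in obs for c in obs[r]
    )


def bonferroni_threshold_df1(alpha, n_tests):
    """Chi-squared critical value for 1 degree of freedom at alpha / n_tests.

    Hypothetical helper: uses the fact that a chi-squared variable with
    1 df is the square of a standard normal variable.
    """
    corrected = alpha / n_tests
    z = NormalDist().inv_cdf(1 - corrected / 2)
    return z * z


expected = expected_counts(observed)
chi2 = chi_squared(observed)
threshold = bonferroni_threshold_df1(0.001, 2 * 2)

print(round(expected["tinnitus"]["age<=54"], 2))  # expected count for one cell
print(round(chi2, 2), round(threshold, 2))        # statistic and threshold
```

Run on the Table 30.1 counts, the statistic agrees with the overall value reported for age in Table 30.3, and the threshold agrees with the 13.41 quoted above for a corrected level of 0.00025.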

Table 30.2 Overall χ² with BTE/ITE right aid

Fields           Overall χ²   Degrees of freedom (df)   P
Age              10.53        1                         <0.001
Comments text    5421.84      663                       <0.001
Gender           33.68        1                         <0.001

Table 30.3 Overall χ² with tinnitus diagnosis

Fields               Overall χ²   Degrees of freedom (df)   P
Age                  41.45        1                         <0.001
Comments text        492.26       60                        <0.001
Gender               0.18         1                         =0.6714
BTE/ITE right aid    31.75        1                         <0.001


To use the chi-squared test, the expected frequency values must all be at least 1, and most should exceed 5 [9]. To be on the safe side, we insisted that for each word all the expected values should be at least 5, so all words failing this test were grouped into a single class called ‘OTHERS’. Keywords associated with categories with 95% confidence were deemed typical of those categories if O > E; otherwise they were deemed atypical. The keywords most typical and atypical of the four categories (hearing aid type, age, tinnitus and gender) are shown in Tables 30.4 and 30.5. A ‘keyword’ could either be a category type (where * denotes a diagnosis category, and *** denotes a hearing aid category), or a word from the free-text comments field (denoted **).

The discovered associations seem intuitively reasonable. For example, it appears that patients with ‘Age ≤ 54’ tend not to have tinnitus, and patients not having tinnitus had a problem with wax and were using BTE hearing aids. The words tinnitus (ringing in the ears) and masker (a machine for producing white noise to drown out tinnitus) were atypical of this category. It was found that males tended more to use ITE hearing aids and females tended more to use BTE hearing aids. The hearing aid types associated with BTE were those with high gain and had changes made to the ear mould. Similarly, ITE hearing aid types used lacquer and vents, required reshelling of ear impressions, had changes made to the hearing aid, were reviewed, and the wearers were making progress.

For these experiments, we used all the records available in the database for each field under study, keeping the criterion that none of the field values should be empty. In Table 30.4, 70 was calculated as the median age of the BTE/ITE right aid group, and in Table 30.5, 54 was the median age of the records with a not-tinnitus or tinnitus diagnosis. In Tables 30.4 and 30.5 some keywords in the comments text were abbreviations, such as ‘reshel’ for ‘reshell’ and ‘fta’ for ‘failed to attend appointment’. ‘Tinnitus’ appears as ‘tinnitu’ in the tables, since all the text was passed through Porter’s stemmer [10] for the removal of grammatical endings.

Table 30.4 Categories with positive and negative keywords in records with BTE/ITE right aid

Age ≤ 70
  Positive keywords: *Not found
  Negative keywords: *Not found
Age > 70
  Positive keywords: *Not found
  Negative keywords: *Not found
BTE
  Positive keywords: **mould, be34, map, gp, 92, audio, inf, be52, ref, staff, reqd, be36, contact
  Negative keywords: **fta, reshel, appt, it, nn, nfa, 2001, rev, lacquer, hn, km, imp, review, 2000
ITE
  Positive keywords: **fta, reshel, appt, it, nn, nfa, 2001, rev, lacquer, hn, km, imp, review, 2000, nh, vent, progress, aid, dt, taken
  Negative keywords: **mould, be34, map, gp, 92, audio, inf, be52, ref, staff, reqd, be36, contact, tri, n, order
Male
  Positive keywords: ***ITE
  Negative keywords: ***BTE
Female
  Positive keywords: ***BTE
  Negative keywords: ***ITE


Table 30.5 Categories with positive and negative keywords in records with a tinnitus/not-tinnitus diagnosis

Age ≤ 54
  Positive keywords: *Not-tinnitus
  Negative keywords: *Not found
Age > 54
  Positive keywords: *Not found
  Negative keywords: *Not-tinnitus
Not-tinnitus
  Positive keywords: **OTHERS, lost, ear, wax, L, aid ***BTE
  Negative keywords: **masker, tinnitu, rev, help, appt, 2001, 2000, counsel, ok, further, progress, fta ***ITE
Tinnitus
  Positive keywords: **masker, tinnitu ***Not found
  Negative keywords: **OTHERS ***Not found
Male
  Positive keywords: ***Not found
  Negative keywords: ***Not found
Female
  Positive keywords: ***Not found
  Negative keywords: ***Not found

30.4 Measures of Association in Categorical Data

Yule’s Q is a measure of the strength of association between categorical variables. Unlike the chi-squared test, which tells us how certain we can be that a relationship between two variables exists, Yule’s Q gives both the strength and direction of that relationship [6]. In the following 2 × 2 table,

           Present   Absent
Present    A         B
Absent     C         D

Yule’s Q is given by

$$Q = \frac{AD - BC}{AD + BC} \qquad (2)$$

where A, B, C and D are the observed quantities in each cell. Yule’s Q is in the range -1 to +1, where the sign indicates the direction of the relationship and the absolute value indicates the strength of the relationship. Yule’s Q does not distinguish between complete associations (where one of the cell values is zero) and absolute relationships (where two diagonally opposite cell values are both zero), and is only suitable for 2 × 2 tables. In Tables 30.6, 30.7, 30.8, and 30.9, Yule’s Q values for age with comment text, diagnosis, hearing aid type, and mould are given. Similarly, in Tables 30.10, 30.11, and 30.12, Yule’s Q values for gender with comment text, hearing aid type and mould are given. ‘(P)’ and ‘(A)’ stand for present and absent.
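As a small illustration of Eq. (2), the following Python sketch computes Yule's Q for a 2 × 2 table. The function name is hypothetical, and the cell layout is an assumption: A and B are the counts of records in which the term is present in the first and second category, and C and D the counts in which it is absent. The example counts are the two rows of Table 30.7.

```python
def yules_q(a, b, c, d):
    """Yule's Q = (AD - BC) / (AD + BC) for a 2 x 2 table.

    Assumed layout: a, b = term present in category 1, category 2;
    c, d = term absent in category 1, category 2.
    """
    denominator = a * d + b * c
    if denominator == 0:
        raise ValueError("Yule's Q is undefined when AD + BC = 0")
    return (a * d - b * c) / denominator


# 'Familial' diagnosis versus age (Table 30.7): one cell is zero, so Q = 1.
q_familial = yules_q(18, 0, 684, 662)

# 'OTHERS' diagnosis versus age (Table 30.7).
q_others = yules_q(113, 44, 589, 618)

print(q_familial, round(q_others, 2))
```

Because a single zero cell already forces Q to +1 or -1, the sketch makes concrete why Q cannot separate complete from absolute associations.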


In Table 30.6, a Yule’s Q value of 0.75 shows that there is a positive association between the keyword ‘progress’ and the category ‘Age ≤ 70’, which can be restated as a negative association between the keyword ‘progress’ and the category ‘Age > 70’. In Table 30.7, for ‘diagnosis’ there is a complete association between ‘familial’ and ‘Age ≤ 54’ (one cell value is zero), resulting in a Yule’s Q value of 1. This should be viewed in comparison to the chi-squared value for the same association, 17.20 (P < 0.001), showing both that the association is very strong and that we can be highly confident that it exists. The presence of this association shows that a higher proportion of younger people than older people report to the hearing aid clinic with familial (inherited) deafness. Familial deafness is relatively rare but can affect any age group, while ‘OTHERS’ would include ‘old-age deafness’ (presbycusis), which is relatively common but obviously restricted to older patients. However, in Table 30.9, Yule’s Q for ‘V2’ is 0.18, which shows only a weak association between this mould and ‘Age ≤ 70’, while the chi-squared value of 30.25 (P < 0.001) for the same association shows that it is highly likely that the association exists.

Table 30.6 Yule’s Q for comment text and age

Comment text   Age ≤ 70 (P)   Age > 70 (P)   Age ≤ 70 (A)   Age > 70 (A)   Yule’s Q
Progress       93             13             46833          45555          0.75
Dna            105            20             46821          45548          0.67
Masker         565            126            46361          45442          0.63
Tinnitus       385            123            46541          45445          0.51
Help           222            84             46704          45484          0.44
Counsel        191            80             46735          45488          0.40
2000           288            125            46638          45443          0.38
Fta            542            332            46384          45236          0.23
Gp             370            615            46162          55060          -0.16
Wax            341            601            46191          55074          -0.19
Ref            248            487            46284          55188          -0.24
Contact        37             129            46495          55546          -0.49
Insert         23             102            46509          55573          -0.58
Reqd           15             111            46517          55564          -0.72
Cic            10             76             46522          55599          -0.73
Staff          17             132            46515          55543          -0.73
Map            15             125            46517          55550          -0.75
Dv             29             245            46503          55430          -0.75
Reinstruct     8              68             46524          55607          -0.75

Table 30.7 Yule’s Q for diagnosis and age

Diagnosis   Age ≤ 54 (P)   Age > 54 (P)   Age ≤ 54 (A)   Age > 54 (A)   Yule’s Q
Familial    18             0              684            662            1.00
OTHERS      113            44             589            618            0.46


Table 30.8 Yule’s Q for hearing aid type and age

Hearing aid type   Age ≤ 70 (P)   Age > 70 (P)   Age ≤ 70 (A)   Age > 70 (A)   Yule’s Q
PFPPCL             42             1              11105          10899          0.95
PPCL               78             5              11069          10895          0.88
BE101              44             4              11103          10896          0.83
PPC2               53             6              11094          10894          0.79
ITENL              123            35             11024          10865          0.55
OTHERS             103            37             11044          10863          0.46
ITEHH              536            317            10611          10583          0.26
–                  4668           3947           6479           6953           0.12
BE34               640            882            10507          10018          -0.18
ITENH              403            592            10744          10308          -0.21
ITENN              683            1063           10464          9837           -0.25
BE36               97             203            11050          10697          -0.37

Table 30.9 Yule’s Q for mould and age

Mould     Age ≤ 70 (P)   Age > 70 (P)   Age ≤ 70 (A)   Age > 70 (A)   Yule’s Q
N8        261            94             10873          10805          0.47
SIL       255            101            10879          10798          0.43
V2        575            397            10559          10502          0.18
2107V1    601            913            10533          9986           -0.23

Table 30.10 Yule’s Q for comment text and gender

Comment text   M (P)   F (P)   M (A)   F (A)   Yule’s Q
He             67      2       46465   55673   0.95
Wife           44      2       46488   55673   0.93
Dv             80      254     46452   55421   -0.45

Table 30.11 Yule’s Q for hearing aid type and gender

Hearing aid type   M (P)   F (P)   M (A)   F (A)   Yule’s Q
ITEHH              665     201     11080   12467   0.58
ITENH              725     295     11020   12373   0.47
ITEHN              1280    1732    10465   10936   -0.13
ITENN              734     1038    11011   11630   -0.14

In Table 30.11, Yule’s Q for ‘ITEHN’ (a type of hearing aid worn inside the ear) is -0.13, which shows a weak negative association between ‘ITEHN’ and ‘male’, or in other words, a weak positive association between ‘ITEHN’ and ‘female’. In comparison, the chi-squared value of 43.36 (P < 0.001) for the same association shows that we can be highly confident that the relationship exists. These results show the complementary nature of the chi-squared and Yule’s Q results: in all three cases the chi-squared value was highly significant, suggesting


Table 30.12 Yule’s Q for mould and gender

Mould   M (P)   F (P)   M (A)   F (A)   Yule’s Q
IROS    80      24      11671   12644   0.57
V2      640     342     11111   12326   0.35
N8      253     141     11498   12527   0.32

that the relationship was highly likely to exist, while Yule’s Q showed that the strength (strong in the first case, weak in the others) and the direction (positive in the first two cases, negative in the third) of the relationship differed among the three cases.

30.5 Support and Confidence for Associations

We examined two measures of association commonly used in market basket analysis, support and confidence [4], for all relations between age and diagnosis, and between gender and diagnosis. We were unable to find many rules with high support and confidence, due to the very high proportion of one type of diagnosis (‘tinnitus’) in the records. However, we feel that given an audiology database in which a diagnosis was routinely recorded for every patient, more rules of the form A ⇒ B (A implies B) would be found. Our results are given in [11].
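For readers unfamiliar with these two measures, the sketch below computes them using their usual definitions: support is the fraction of all records that contain both A and B, and confidence is the fraction of records containing A that also contain B. The function names are hypothetical, and the rule ‘Age > 54 ⇒ tinnitus’ with counts from Table 30.1 is chosen here purely for illustration; it is not one of the results reported in [11].

```python
def support(n_both, n_total):
    """Fraction of all records that contain both A and B."""
    return n_both / n_total


def confidence(n_both, n_antecedent):
    """Fraction of the records containing A that also contain B."""
    return n_both / n_antecedent


# Illustrative rule 'Age > 54 => tinnitus', using counts from Table 30.1:
# 592 records with both, 662 records with Age > 54, 1364 records in total.
rule_support = support(592, 1364)
rule_confidence = confidence(592, 662)

# Share of all records with a tinnitus diagnosis, for comparison.
baseline_tinnitus = 1127 / 1364

print(round(rule_support, 3), round(rule_confidence, 3), round(baseline_tinnitus, 3))
```

Comparing the confidence with the baseline share of tinnitus records shows why a dominant diagnosis makes such measures hard to interpret on their own: a rule can look confident simply because its consequent is very common.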

30.6 Likelihood Ratios for Associated Keywords

In Bayesian evidence evaluation [6], the value of a piece of evidence may be expressed as a likelihood ratio (LR), as follows:

$$LR = \Pr(E/H) \,/\, \Pr(E/\bar{H})$$

For example, our hypothesis (H) might be that a patient should be fitted with a BTE hearing aid as opposed to an ITE hearing aid. E is a piece of evidence, such as the word ‘tube’ appearing in the patient’s comments field of the database. Pr(E/H) is then the probability of seeing this evidence given that the hypothesis is true. Of all the 34394 records where a patient was given a BTE aid, 29 contained the word ‘tube’, so in this case Pr(E/H) = 29/34394 = 0.000843. Pr(E/H̄) is the probability of seeing the word ‘tube’ when the hypothesis is not true. Of all the 29445 records where a patient was given an ITE aid (the ITE total in Table 30.13), only 2 contained the word ‘tube’, so here Pr(E/H̄) = 2/29445 = 0.0000679. This gives an LR of 0.000843/0.0000679 = 12.41. Using Evett et al.’s [12] scale of verbal equivalences of the LR, an LR in the range 10–100 indicates moderate support for the hypothesis. LRs in the range 0.1–10 indicate only limited support either way, while an LR in the range 0.01–0.1 would indicate moderate support for the complementary hypothesis. The words giving the highest and lowest LR values

Table 30.13 Likelihood ratios for comments text and BTE/ITE right aids

Word          BTE     ITE     LR
Adequ         14      0       NA
Audiometer    10      0       NA
Be10          18      0       NA
Be201         18      0       NA
Be301         13      0       NA
Be37          12      0       NA
Be51          13      0       NA
Hac           11      0       NA
Temporari     11      0       NA
Therapy       13      0       NA
Be52          68      2       29.11
Be53          26      1       22.26
Be36          57      3       16.27
Be54          35      2       14.98
Retub         34      2       14.55
Seri          16      1       13.70
Cwc           15      1       12.84
Tube          29      2       12.41
Couldn’t      14      1       11.99
Orig          14      1       11.99
"map          13      1       11.13
Map           116     9       11.03
E             12      1       10.27
Hn            8       77      0.09
Progress      4       39      0.09
Readi         1       10      0.09
Concertina    1       11      0.08
Unless        1       11      0.08
Coat          1       13      0.07
Cap           1       15      0.06
Vc            1       15      0.06
Hnv1          1       17      0.05
Hh            1       20      0.04
Reshel        6       136     0.04
Lacquer       2       65      0.03
Facepl        0       15      0
Window        0       16      0
Total         34394   29445

with respect to a BTE fitting as opposed to an ITE fitting are shown in Table 30.13, where NA indicates division by zero as the word never appeared in records for patients fitted with an ITE hearing aid. All words which were used in the chi-squared analysis (since their expected values were all 5 or more) were also considered for this analysis. LR values are useful for the combination of evidence. Using the evidence that the text comments field contains ‘lacquer’, ‘reshell’ and ‘progress’, we can


estimate the likelihood of the patient requiring a BTE hearing aid by iteratively using the relationship ‘posterior odds = LR × prior odds’. Initially we obtain the prior odds (Pr(BTE)/Pr(ITE)) from a large sample or from manufacturer’s data. Using the column totals in Table 30.13, the prior odds in favour of a BTE aid before any other evidence has been taken into account would be 34394/29445 = 1.168 to 1. Taking the first piece of evidence (the presence of the word ‘lacquer’) into account, the posterior odds are 0.03 × 1.168 = 0.035. This posterior odds value now becomes the prior odds for the second iteration. The LR for ‘reshell’ is 0.04, so the posterior odds become 0.04 × 0.035 = 0.0014. This posterior odds value now becomes the prior odds for the third iteration. The LR for ‘progress’ is 0.09, so the final posterior odds become 0.09 × 0.0014 = 0.000126. Since these posterior odds are much less than 1, it is much more likely that the patient should be fitted with an ITE hearing aid. This simple example shows the basis on which a Bayesian decision support system that returns the more suitable type of hearing aid could be constructed.
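The iterative update described above can be sketched as follows. This is a minimal illustration rather than the authors' system: the names are hypothetical, the word counts are copied from Table 30.13, and multiplying the LRs together implicitly assumes that the pieces of evidence are conditionally independent given the hearing aid type.

```python
# Column totals and (BTE count, ITE count) per stemmed word, from Table 30.13.
TOTAL_BTE, TOTAL_ITE = 34394, 29445
word_counts = {
    "tube": (29, 2),
    "lacquer": (2, 65),
    "reshel": (6, 136),
    "progress": (4, 39),
}


def likelihood_ratio(word):
    """LR = Pr(word | BTE) / Pr(word | ITE); None if the word never occurs with ITE."""
    bte, ite = word_counts[word]
    if ite == 0:
        return None  # corresponds to 'NA' in Table 30.13
    return (bte / TOTAL_BTE) / (ite / TOTAL_ITE)


def posterior_odds(prior_odds, likelihood_ratios):
    """Apply 'posterior odds = LR x prior odds' once per piece of evidence."""
    odds = prior_odds
    for lr in likelihood_ratios:
        odds *= lr
    return odds


lr_tube = likelihood_ratio("tube")

# Worked example with the rounded figures used in the text.
rounded_odds = posterior_odds(1.168, [0.03, 0.04, 0.09])

# The same update with unrounded prior odds and likelihood ratios.
evidence = ("lacquer", "reshel", "progress")
exact_odds = posterior_odds(
    TOTAL_BTE / TOTAL_ITE, [likelihood_ratio(w) for w in evidence]
)

print(round(lr_tube, 2), rounded_odds, exact_odds)
```

The unrounded product differs somewhat from the rounded worked figures, but in both cases the posterior odds are far below 1, which points to an ITE fitting.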

30.7 Conclusion

In this work we have discovered typical and atypical words related to different fields of audiology data, by first using the chi-squared measure to show which relations most probably exist, then using Yule’s Q measure of association to find the strength and direction of those relations. The likelihood ratio, also based on the contingency table, provides a means whereby all the words in the comments field can be taken into account in a Bayesian decision support system for audiologists. We are currently working on the development of a logistic regression model, where the overall value log(Pr(BTE)/Pr(ITE)) will be a linear combination of the presence or absence of each of the discovered associated variables described in this chapter. Analogous reasoning will be used for models to predict whether or not a patient should be given a tinnitus masker, and whether or not he or she would benefit from a hearing aid fitting.

Rules found by data mining should not only be accurate and comprehensible, but also ‘surprising’. McGarry presents a taxonomy of ‘interestingness’ measures whereby the value of discovered rules may be evaluated [13]. In this chapter, we have looked at objective interestingness criteria, such as the statistical significance of the discovered rules, but we have not yet considered subjective criteria such as unexpectedness and novelty. These require comparing machine-derived rules with the prior expectations of domain experts. A very important subjective criterion is ‘actionability’, which includes such considerations as impact: will the discovered rules lead to any changes in current audiological practice?

Acknowledgments We wish to thank Maurice Hawthorne, Graham Clarke and Martin Sandford at the Ear, Nose and Throat Clinic at James Cook University Hospital in Middlesbrough, England, UK, for making the large set of audiology records available to us.


References

1. Pendharkar PC, Rodger JA, Yaverbaum GJ, Herman N, Benner M (1999) Association, statistical, mathematical and neural approaches for mining breast cancer patterns. Expert Syst Appl, Elsevier Science Ltd 17:223–232
2. Bramer M (2007) Principles of data mining. Springer, London, pp 187–218
3. Ordonez C, Ezquerra N, Santana CA (2006) Constraining and summarizing association rules in medical data. In: Cercone N et al (eds) Knowledge and information systems. Springer, New York, pp 259–283
4. Han J, Kamber M (2006) Data mining concepts and techniques, 2nd edn. Morgan Kaufmann Publishers, San Diego, pp 227–272
5. Altman DG (1991) Practical statistics for medical research. Chapman & Hall, London, pp 241–248, 211, 271
6. Lucy D (2005) Introduction to statistics for forensic scientists. Wiley, Chichester, pp 45–52, 112–114, 133–136
7. Oakes M, Gaizauskas R, Fowkes H et al (2001) Comparison between a method based on the chi-square test and a support vector machine for document classification. In: Proceedings of ACM SIGIR, New Orleans, pp 440–441
8. Chi-square calculator (2010). http://www.stat.tamu.edu/~west/applets/chisqdemo.html
9. Agresti A (2002) Categorical data analysis, 2nd edn. Wiley series in probability and statistics. Wiley, New York, p 80
10. Porter MF (1980) An algorithm for suffix stripping. Program 14(3):130–137
11. Anwar MN, Oakes MP, McGarry K (2010) Chi-squared and associations in tabular audiology data. In: Lecture notes in engineering and computer science: proceedings of the world congress on engineering 2010, WCE 2010, London, UK, vol 1, pp 346–351
12. Evett IW, Jackson G, Lambert JA, McCrossan S (2000) The impact of the principles of evidence interpretation and the structure and content of statements. Sci Justice 40:233–239
13. McGarry K (2005) A survey of interestingness measures for knowledge discovery. Knowl Eng Rev J 20(1):39–61

Chapter 31

Optimising Order Splitting and Execution with Fuzzy Logic Momentum Analysis

Abdalla Kablan and Wing Lon Ng

Abstract This study proposes a new framework for high frequency trading using a fuzzy logic based momentum analysis system. An order placement strategy is developed and optimised with adaptive neuro fuzzy inference in order to analyse the current “momentum” in the time series and to identify the current market condition, which is then used to decide the dynamic participation rate given the current traded volume. The system was applied to the trading of financial stocks, and tested against the standard volume based trading system. The results show that the proposed Fuzzy Logic Momentum Analysis System outperforms the standard volume based systems that are widely used in the financial industry.

A. Kablan (✉) · W. L. Ng
Centre for Computational Finance and Economic Agents (CCFEA), University of Essex, Wivenhoe Park, Colchester, CO4 3SQ, UK
e-mail: [email protected]
W. L. Ng
e-mail: [email protected]

S. I. Ao and L. Gelman (eds.), Electrical Engineering and Applied Computing, Lecture Notes in Electrical Engineering, 90, DOI: 10.1007/978-94-007-1192-1_31, Springer Science+Business Media B.V. 2011

31.1 Introduction

The modelling of financial systems continues to hold great interest not only for researchers but also for investors and policymakers. Many of the characteristics of these systems, however, cannot be adequately captured by traditional financial modelling approaches. Financial systems are complex, nonlinear, dynamically changing systems in which it is often difficult to identify interdependent variables and their values. In particular, the problem of optimal order execution has been a main concern for financial trading and brokerage firms for decades [1]. The idea of executing a client’s order to buy or sell a pre-specified number of shares at a price better than all other competitors seems intriguing. However, this involves the implementation of a system that considers the whole price formation process from a different point of view. Financial brokers profit from executing clients’ orders to buy and sell certain amounts of shares at the best possible price. Many mathematical and algorithmic systems have been developed for this task [2], yet they seem unable to outperform a standard volume-based system.

Most systems use well-documented technical indicators from financial theory for their observations. For example, Dourra and Siy [3] used three technical indicators in their stock trading system: the rate of change, the stochastic momentum indicator and a support-resistance indicator that is based on the 30-day price average. A convergence module then maps these indices as well as the closing price to a set of inputs for the fuzzy system, thus providing a total of seven inputs. In some cases, such as the rate of change, one indicator maps to a single input. However, it is also possible to map one indicator to multiple inputs. Four levels of quantification for each input value are used: small, medium, big and large. In this case, Mamdani’s form of fuzzy rules [4] can be used to combine these inputs and produce a single output variable with a value between 0 and 100. Low values indicate a strong sell, high values a strong buy. The system is evaluated using 3 years of historical stock price data from four companies with variable performance during one period and employing two different strategies (risk-based and performance-based). In each strategy, the system begins with an initial investment of $10,000 and assumes a constant transaction cost of $10. Tax implications are not taken into consideration. The resulting system output is shown to compare favourably with stock price movement, outperforming the S&P 500 in the same period.
The application presented in this study differs from the above, as it introduces a fuzzy logic-based system for momentum analysis [5]. The system uses fuzzy reasoning to analyse the market conditions under which the price of a given equity is currently moving. This analysis is then used in a trading application. The membership functions were first chosen by the expert-based method and later optimised using ANFIS [6], further improving the trading strategy and order execution results.

31.2 Fuzzy Logic Momentum Analysis System (Fulmas)

31.2.1 Fuzzy Inference

A fuzzy inference system is a rule-based fuzzy system that can be seen as an associative memory and comprises five components:

• A rule base, which consists of the fuzzy if–then rules.
• A database, which defines the membership functions of the fuzzy sets used in the fuzzy rules.
• A decision-making unit, which is the core unit and is also known as the inference engine.
• A fuzzification interface, which transforms crisp inputs into degrees of matching linguistic values.
• A defuzzification interface, which transforms fuzzy results into crisp output.

Many types of fuzzy inference systems have been proposed in the literature [7]. However, in the implementation of an inference system, the most common is the Sugeno model, which makes use of if–then rules to produce an output for each rule. Rule outputs consist of a linear combination of the input variables plus a constant term; the final output is the weighted average of each rule’s output. The rule base in the Sugeno model has rules of the form:

If $X$ is $A_1$ and $Y$ is $B_1$, then $f_1 = p_1 X + q_1 Y + r_1$. (31.1)

If $X$ is $A_2$ and $Y$ is $B_2$, then $f_2 = p_2 X + q_2 Y + r_2$. (31.2)

Here $X$ and $Y$ are the input variables, $A_i$ and $B_i$ are fuzzy sets described by predefined membership functions, and $p_i$, $q_i$ and $r_i$ are the consequent parameters. To evaluate the first-order Sugeno model [8], the degree of membership of $X$ in $A_i$ is multiplied by the degree of membership of $Y$ in $B_i$, and the product is the weight $W_i$ of rule $i$. Finally, the weighted average of $f_1$ and $f_2$ is deemed the final output $Z$, which is calculated as

$Z = \frac{W_1 f_1 + W_2 f_2}{W_1 + W_2}$. (31.3)
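The two-rule evaluation in Eqs. 31.1 to 31.3 can be sketched in a few lines of Python. This is only a minimal illustration: the membership functions `low` and `high`, the consequent parameters and the input values are invented for the example and do not come from the chapter.

```python
def sugeno_two_rule(x, y, mu_a, mu_b, consequents):
    """Evaluate a first-order Sugeno system (Eqs. 31.1-31.3).

    mu_a, mu_b  : lists of membership functions for A_i and B_i
                  (hypothetical callables returning a grade in [0, 1])
    consequents : list of (p_i, q_i, r_i) tuples
    """
    weights, outputs = [], []
    for a, b, (p, q, r) in zip(mu_a, mu_b, consequents):
        weights.append(a(x) * b(y))          # firing strength W_i
        outputs.append(p * x + q * y + r)    # rule output f_i
    total = sum(weights)
    if total == 0.0:
        raise ValueError("no rule fires for this input")
    # Weighted average of the rule outputs (Eq. 31.3).
    return sum(w * f for w, f in zip(weights, outputs)) / total


# Illustrative membership functions on [0, 1] (assumed, not from the chapter).
low = lambda v: max(0.0, 1.0 - v)
high = lambda v: max(0.0, min(1.0, v))

z = sugeno_two_rule(
    0.25, 0.75,
    mu_a=[low, high],
    mu_b=[low, high],
    consequents=[(1.0, 1.0, 0.0), (2.0, 0.0, 1.0)],
)
print(z)
```

With these made-up inputs both rules fire with equal strength, so the output is simply the midpoint of the two rule outputs.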

When designing a fuzzy system for financial modelling, one should opt for a model similar to that of Mamdani and Assilian [4], which is based on linguistic variables and linguistic output. Fuzzy logic provides a reasoning-like mechanism that can be used for decision making. Combined with a neural network architecture, the resulting system is called a neuro-fuzzy system. Such systems are used for optimisation since they combine the reasoning mechanism of fuzzy logic with the pattern recognition capabilities of neural networks, as discussed in the following section.

31.2.2 Adaptive Neuro Fuzzy Inference System (ANFIS)

The ANFIS is an adaptive network of nodes and directional links with associated learning rules [6]. The approach learns the rules and membership functions from the data [8]. It is called adaptive because some or all of the nodes have parameters that affect the output of the node. These networks identify and learn relationships between inputs and outputs, and have high learning capability and membership function definition properties. Although adaptive networks cover a number of different approaches, for our purposes we will conduct a detailed investigation of the method proposed by Jang et al. [9], with the architecture shown in Fig. 31.1. The circular nodes have a fixed input–output relation, whereas the square nodes have parameters to be learnt.

Fig. 31.1 ANFIS architecture for a two rule Sugeno system (inputs X and Y, Layers 1 to 5, output F)

Typical fuzzy rules are defined as conditional statements of the form:

If $X$ is $A_1$, then $Y$ is $B_1$. (31.4)

If $X$ is $A_2$, then $Y$ is $B_2$. (31.5)

However, in ANFIS we use the first-order Takagi–Sugeno system [8] shown in Eqs. 31.1 and 31.2. ANFIS can also be used to design forecasting systems [10]. We briefly discuss the five layers in the following:

1. The output of each node in Layer 1 is:

$O_{1,i} = \mu_{A_i}(x)$ for $i = 1, 2$,

$O_{1,i} = \mu_{B_{i-2}}(y)$ for $i = 3, 4$.

Hence, $O_{1,i}$ is essentially the membership grade for $x$ and $y$. Although the membership functions could be very flexible, experimental results lead to the conclusion that for the task of financial data training, the bell-shaped membership function is most appropriate (see, e.g., Abonyi et al. [11]). We calculate

$\mu_{A_i}(x) = \frac{1}{1 + \left| \frac{x - c_i}{a_i} \right|^{2 b_i}}$, (31.6)

where $a_i$, $b_i$, $c_i$ are parameters to be learnt. These are the premise parameters.

2. In Layer 2, every node is fixed. This is where the t-norm is used to “AND” the membership grades, for example, the product:

$O_{2,i} = W_i = \mu_{A_i}(x)\,\mu_{B_i}(y)$, $i = 1, 2$. (31.7)


3. Layer 3 contains fixed nodes that calculate the ratio of the firing strengths of the rules:

$O_{3,i} = \bar{W}_i = \frac{W_i}{W_1 + W_2}$. (31.8)

4. The nodes in Layer 4 are adaptive and perform the consequent of the rules:

$O_{4,i} = \bar{W}_i f_i = \bar{W}_i (p_i x + q_i y + r_i)$. (31.9)

The parameters $(p_i, q_i, r_i)$ in this layer are to be determined and are referred to as the consequent parameters.

5. In Layer 5, a single node computes the overall output:

$O_{5} = \sum_i \bar{W}_i f_i = \frac{\sum_i W_i f_i}{\sum_i W_i}$. (31.10)

This is how the input vector is typically fed through the network layer by layer. We then consider how the ANFIS learns the premise and consequent parameters for the membership functions and the rules in order to optimise these in the Fuzzy Logic Momentum Analysis System to produce a further improved system with a higher performance.
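The five layers described above can be traced in a short forward-pass sketch. It covers only the feed-forward computation for a two-rule system, not the learning of the premise and consequent parameters, and all parameter values below are hypothetical, chosen only to exercise the code.

```python
def bell(x, a, b, c):
    """Generalised bell membership function of Eq. 31.6."""
    return 1.0 / (1.0 + abs((x - c) / a) ** (2.0 * b))


def anfis_forward(x, y, premise_a, premise_b, consequents):
    """One forward pass through the five layers of a two-rule ANFIS.

    premise_a, premise_b : lists of (a_i, b_i, c_i) for the sets A_i and B_i
    consequents          : list of (p_i, q_i, r_i)
    """
    # Layer 1: membership grades of the crisp inputs.
    mu_a = [bell(x, *params) for params in premise_a]
    mu_b = [bell(y, *params) for params in premise_b]
    # Layer 2: firing strengths via the product t-norm (Eq. 31.7).
    w = [ma * mb for ma, mb in zip(mu_a, mu_b)]
    # Layer 3: normalised firing strengths (Eq. 31.8).
    total = sum(w)
    w_bar = [wi / total for wi in w]
    # Layer 4: weighted rule consequents (Eq. 31.9).
    o4 = [wb * (p * x + q * y + r) for wb, (p, q, r) in zip(w_bar, consequents)]
    # Layer 5: overall output (Eq. 31.10).
    return sum(o4), w_bar


# Hypothetical parameter values, not taken from the chapter.
output, w_bar = anfis_forward(
    x=1.0, y=2.0,
    premise_a=[(1.0, 2.0, 0.0), (1.0, 2.0, 2.0)],
    premise_b=[(1.0, 2.0, 1.0), (1.0, 2.0, 3.0)],
    consequents=[(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)],
)
print(output, w_bar)
```

Because the bell function is strictly positive, the normalisation in Layer 3 never divides by zero in this sketch.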

31.2.3 Fulmas for Trading

Creating a fuzzy inference system to detect momentum is a complex task. The identification of various market conditions has been the subject of various theories and suggestions [12]. In the following, the proposed fuzzy inference system categorises the market conditions into seven categories based on price movement, using the current volume to determine the participation rate (PR) of the trading system each time. The participation rate is the share of the current volume that will be traded at each instance.

The first step in designing the Fuzzy Logic Momentum Analysis System involves defining the “market conditions” that the fuzzy system has to identify. The following seven market conditions are used to cover all possible movements of the price series:

• Rallying
• Strong up
• Slightly up
• Average
• Slightly down
• Strong down
• Crashing


These conditions are considered as linguistic values for the fuzzy logic system, and they are used to determine the current state of the price formation and its momentum. As momentum builds, the system considers the previous $x$ ticks and performs an inference procedure by accumulating the movements of each price relative to the previous price in order to determine whether the general trend has been up or down after $x$ points. Let $P_i$ denote the current price and $P_{i-1}$ the previous price; $k_i$ is a counter that follows the movement of the price. Whenever the price goes up, it adds 1, and when the price goes down, it subtracts 1 (an unchanged price contributes 0). Hence, this can be used to identify market conditions: if the market is moving strongly upwards, this is detected by having more $+1$ than $-1$ or $0$. This can be modelled as

$\mathrm{Momentum}(x) = \sum_{i=1}^{x} k_i$, (31.11)

where $x$ is the number of ticks over which we want to detect the momentum. For example, if we want to detect the momentum of the last 100 ticks, we count all up and down movements and then feed the resulting number to the fuzzy system, which maps it onto the membership functions of the market conditions. The choice of triangular membership functions was made after using the expert-based method, where it was suggested that triangular membership functions should be used due to their mathematical simplicity. Triangular shapes require three parameters and are made of a closed interval and a kernel comprised of a singleton. This simplifies the choice of placing the membership functions. The expert merely has to choose the central value and the curve slope on either side. The same procedure is applied for calculating the linguistic variable “volatility”, where the linguistic values are:

• Very high
• High
• Medium
• Low
• Very low
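The momentum counter of Eq. 31.11 is straightforward to compute from a price series. The sketch below assumes that an unchanged price contributes 0 to the sum; the tick prices are made up for illustration.

```python
def momentum(prices, x):
    """Sum the tick-direction counters k_i over the last x ticks (Eq. 31.11)."""
    if x < 1 or len(prices) < x + 1:
        raise ValueError("need at least x + 1 prices to form x movements")
    window = prices[-(x + 1):]
    total = 0
    for previous, current in zip(window, window[1:]):
        if current > previous:
            total += 1      # k_i = +1 on an up-tick
        elif current < previous:
            total -= 1      # k_i = -1 on a down-tick
        # unchanged price: k_i = 0 (assumption)
    return total


ticks = [10.0, 10.1, 10.1, 10.3, 10.2, 10.4]   # made-up prices
print(momentum(ticks, 5))
```

For a window of $x$ ticks the result always lies between $-x$ and $+x$, which gives the fuzzy system a bounded input range.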

The fuzzy logic system considers both market momentum and volatility. It generates the rules and then decides on the amount of market participation. This is illustrated in Fig. 31.2.
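As a rough illustration of how a momentum value could be fuzzified into the seven market conditions with triangular membership functions, consider the following sketch. The even spacing of the triangles over $[-x, x]$ is an assumption made for the example; the chapter does not state the positions chosen by the experts, and the volatility input and the rule base are not modelled here.

```python
def triangular(v, left, centre, right):
    """Triangular membership function with its kernel at `centre`."""
    if v == centre:
        return 1.0
    if v <= left or v >= right:
        return 0.0
    if v < centre:
        return (v - left) / (centre - left)
    return (right - v) / (right - centre)


CONDITIONS = ["Crashing", "Strong down", "Slightly down", "Average",
              "Slightly up", "Strong up", "Rallying"]


def fuzzify_momentum(m, x=100):
    """Membership grade of momentum m in each of the seven market conditions.

    Assumption: the seven triangles are evenly spaced over [-x, x].
    """
    n = len(CONDITIONS)
    step = 2 * x / (n - 1)
    grades = {}
    for j, name in enumerate(CONDITIONS):
        centre = -x + 2 * x * j / (n - 1)
        grades[name] = triangular(m, centre - step, centre, centre + step)
    return grades


grades = fuzzify_momentum(25)
print({name: round(g, 3) for name, g in grades.items() if g > 0})
```

With evenly spaced triangles, any momentum value inside the range belongs to at most two neighbouring conditions, and the two grades sum to one.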

31.3 Empirical Analysis

Experiments in this study have been carried out on high-frequency tick data, obtained from ICAP plc, for both Vodafone Group plc (VOD) and Nokia Corporation (NOK). A very important characteristic of this type of data is that it is irregularly spaced in time, which means that the price observations (ticks) are taken in real time (as they arrive). The application is designed for an interdealer broker,¹ which means that they have the ability to create orders with any amount of volume. For both stocks, 2 months of high-frequency tick data between 2 January 2009 and 27 February 2009 have been obtained; simulations are terminated whenever 1 million shares have been bought or sold. The fuzzy logic system receives the first batch of data and performs all of the buy or sell actions on it. The same procedure is repeated using the standard volume-based system. Finally, the performance of both systems is compared. It must be mentioned that 2 months of high-frequency tick data is a very large amount of data, considering that at every iteration the system analyses the momentum of the past 100 ticks (Fig. 31.3).

Fig. 31.2 Extracting fuzzy rules from both volatility and momentum

¹ An interdealer broker is a member of a major stock exchange who is permitted to deal with market makers, rather than the public, and can sometimes act as a market maker.

Fig. 31.3 Time series data of NOK and VOD prices (two panels, VOD and NOK; price against date, 2 January 2009 to 2 March 2009)

31.3.1 Standard Volume System (SVS)

A standard brokerage and trading mechanism for executing large orders is a simple volume-based system that tracks the volume being traded: whenever a certain number of shares (a threshold) has been traded, the system will buy or sell (depending on the order) a certain percentage. If there is an order to trade one million shares of a certain stock, the threshold could be, for example, 10,000 shares. Whenever 10,000 shares have been traded, and if the participation rate PR is set to 25%, the system will buy or sell 25% of the average volume. If the accumulated sum of the volume exceeds the predefined threshold, then the amount of shares traded is equal to the PR multiplied by the current volume. The total cost is then

$\text{Total SVS Cost} = \sum_{i=1}^{n} \text{price}_i \times (\text{amount of shares})_i$,

where $n$ is the number of operations required to reach the target order (for example, 1 million shares). The above system has proven to be efficient and has been adopted by many financial brokerage and execution institutions [13].
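A simplified simulation of the volume-based rule might look as follows. Several details are assumptions made for the sketch rather than statements from the chapter: ticks are given as (price, volume) pairs, the traded amount is the PR times the volume accumulated since the last trade, trades execute at the current tick price, and the accumulator is reset after each trade.

```python
def svs_total_cost(ticks, target=1_000_000, threshold=10_000, pr=0.25):
    """Total cost of a buy order executed by a simple volume-based rule.

    ticks : iterable of (price, volume) pairs (hypothetical input format)
    Returns the total cost and the number of shares actually filled.
    """
    accumulated = 0
    remaining = target
    cost = 0.0
    for price, volume in ticks:
        accumulated += volume
        if accumulated >= threshold:
            # Trade PR times the accumulated volume, capped by the open order.
            shares = min(int(pr * accumulated), remaining)
            cost += price * shares          # price_i * (amount of shares)_i
            remaining -= shares
            accumulated = 0
            if remaining == 0:
                break
    return cost, target - remaining


# Made-up ticks: (price, volume)
ticks = [(10.0, 6_000), (10.2, 5_000), (10.1, 12_000), (9.9, 3_000)]
cost, filled = svs_total_cost(ticks, target=5_000)
print(cost, filled)
```

A momentum-driven variant would differ only in replacing the constant `pr` with a rate chosen at each trade from the current market condition.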

31.3.2 Benchmark Performance Measures

Although many systems have used different approaches, such as quantum modelling, to determine the various participation rates (PR), they usually fail to outperform the standard volume system in the long term. The aim of this study is to show that FULMAS outperforms this type of system in the long run; this is assessed using order execution costs for buy and sell orders. In particular, FULMAS will be applied to determine the PR in the market according to the current momentum. For example, for a buy order, it is preferable to increase the PR (number of shares bought at that time) when the price is low and to decrease the participation when the price is high. The idea here is to use the momentum analysis system to identify the market condition in which we currently reside.

Table 31.1 Participation rates for buy side and the sell side of FULMAS

Market condition    Buying participation rate (%)    Selling participation rate (%)
Rallying            10                               40
Strong up           15                               35
Slightly up         20                               30
Average             25                               25
Slightly down       30                               20
Strong down         35                               15
Crashing            40                               10

This will enable us to vary the PR, providing a trading advantage, since the system can trade aggressively when the condition is at one extreme and minimise its trading when the condition is at the other extreme. In other words, if we are selling 1 million shares, the system will make a trade whenever the volume threshold has been exceeded. However, if the current market condition indicates that the price is very high or rallying, then we know that this is a suitable time to sell a lot of shares, for example, 40% of the current volume. The same concept applies when the momentum indicates that the price is strong down, which means that the system should sell a lower volume at this low price, for example, 15%. The reverse mechanism applies for buying shares. When the market is crashing, this is a good indicator that we should buy a large volume (40%), and when the price is at an average point, the system behaves like the SVS, i.e., buying 25% of the volume. This is shown in Table 31.1. The same procedure is applied to volatility and then combined with volume to produce the fuzzy rules.

When implementing SVS and FULMAS, the benchmark against which both systems are compared is the outperformance of FULMAS over the SVS, expressed in basis points (one hundredth of 1%). To calculate the improvement (imp) for the buy and sell sides, the following formulas are used:

$\mathrm{imp}_{\mathrm{Buy}} = \left(1 - \frac{\text{FULMAS price}}{\text{SVS price}}\right) \times 10^4$ bps,

$\mathrm{imp}_{\mathrm{Sell}} = \left(\frac{\text{FULMAS price}}{\text{SVS price}} - 1\right) \times 10^4$ bps,

where FULMAS price is the total cost of buying x amount of shares using FULMAS, and SVS price is the total cost of buying the same number of shares using the traditional SVS.
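The participation rates of Table 31.1 and the improvement measure can be written down directly. In this sketch the sell-side measure is assumed to be the mirror image of the buy-side one, so that a positive value always means FULMAS did better; the prices used in the example are invented.

```python
# Participation rates in percent of current volume, as listed in Table 31.1.
BUY_PR = {"Rallying": 10, "Strong up": 15, "Slightly up": 20, "Average": 25,
          "Slightly down": 30, "Strong down": 35, "Crashing": 40}
SELL_PR = {"Rallying": 40, "Strong up": 35, "Slightly up": 30, "Average": 25,
           "Slightly down": 20, "Strong down": 15, "Crashing": 10}


def improvement_bps(fulmas_price, svs_price, side):
    """Outperformance of FULMAS over SVS in basis points."""
    ratio = fulmas_price / svs_price
    if side == "buy":
        return (1.0 - ratio) * 1e4     # a cheaper purchase is an improvement
    if side == "sell":
        return (ratio - 1.0) * 1e4     # higher proceeds are an improvement (assumed sign)
    raise ValueError("side must be 'buy' or 'sell'")


# Made-up totals for a one-million-share order.
print(improvement_bps(999_000.0, 1_000_000.0, "buy"))
print(improvement_bps(1_000_500.0, 1_000_000.0, "sell"))
```

Note that the buy and sell rates for each condition sum to 50%, so the Average condition reproduces the 25% used by the SVS on both sides.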

31.3.3 Results

The complementary characteristics of neural networks and fuzzy inference systems have been recognised and the methodologies have been combined to create neuro-fuzzy techniques. Indeed, earlier work by Wong and Wang [14] described an artificial neural network with processing elements that could handle fuzzy logic and probabilistic information, although the preliminary results were less than satisfactory. In this study, ANFIS is used to optimise the membership functions in FULMAS. This is performed by feeding the ANFIS system both the training data and the desired output, and tuning the ANFIS in order to reach the target result by modifying the membership functions (see Figs. 31.4 and 31.5). In other words, at each instance, ANFIS is fed the results currently obtained from the fuzzy system together with a set of target prices or data. This target price is an optimal price that is far better than the current one (a cheaper price in buy mode or a higher price in sell mode). The system runs and modifies the membership functions in each epoch in order to get as close to the optimal price as possible.

Fig. 31.4 Triangular membership functions optimised using ANFIS (panels: initial MFs and final MFs; degree of membership against input2 for Crashing, StrongDown, SlightlyDown, Average, SlightlyUp, StrongUp and Rallying)

Fig. 31.5 Bell-shaped membership functions optimised using ANFIS (panels: initial MFs and final MFs; degree of membership against input for the same seven conditions)

Comparing the results of both optimised membership functions, an improvement on the original system was found. The optimised triangular membership functions have also outperformed the optimised bell-shaped membership functions; this confirms the experts’ opinion mentioned above concerning the choice of the triangular membership functions. Table 31.2 displays the improvement of FULMAS against SVS, showing the descriptive statistics of the improvement rate of buying or selling one million shares. This improvement rate can be either positive, when FULMAS has outperformed SVS, or negative, when FULMAS was outperformed by SVS. In particular, we see a much higher outperformance than in the previous system, which confirms that the use of ANFIS to optimise the membership functions has increased the performance of the system on both the buy and sell sides. For example, Table 31.2 shows that on the buying side, the system, on average, outperforms the standard system by more than six basis points. On an industrial scale, this means a large amount of savings for financial institutions that employ such systems to vary the participation rates. Other descriptive statistics such as the standard deviation, skewness and kurtosis are also included. These imply that the outperformance of FULMAS over SVS is actually considerable given the higher values of the median. Also, the skewness is closer to zero, and the kurtosis has decreased in most cases, both implying a higher accuracy of the improved system.

Table 31.2 Analysis of results of buying and selling 1 million shares of NOK and VOD with the descriptive statistics of the improvement indicators (in bps per trade)

                      Mean     Median   Std dev   Skewness   Kurtosis
Initial results
  Buying NOK          2.98     4.63     12.39     -0.05      2.56
  Buying VOD          12.48    1.58     36.25     1.74       4.86
  Selling NOK         1.68     2.92     8.79      -1.43      6.25
  Selling VOD         2.73     2.46     27.71     0.70       8.84
Optimised results
  Buying NOK          6.94     6.57     12.99     0.15       2.53
  Buying VOD          14.48    4.33     2.95      -0.74      3.28
  Selling NOK         9.36     5.79     9.18      -0.52      2.61
  Selling VOD         7.71     6.91     28.23     0.86       9.38
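The descriptive statistics reported for the improvement indicators can be reproduced for any list of per-trade improvements with a few lines of standard-library Python. The chapter does not say which estimators were used, so the choices here are assumptions: the sample standard deviation, and moment-based skewness and (non-excess) kurtosis. The sample values are made up.

```python
import statistics


def describe(values):
    """Mean, median, sample standard deviation, skewness and kurtosis."""
    n = len(values)
    mean = statistics.fmean(values)
    # Central moments of order 2, 3 and 4 (population form).
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return {
        "mean": mean,
        "median": statistics.median(values),
        "std": statistics.stdev(values),
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2,     # equals 3 for a normal distribution
    }


sample_bps = [-2.0, 0.0, 2.0, 4.0, 6.0]   # made-up improvements in bps
stats = describe(sample_bps)
print(stats)
```

A skewness near zero and a kurtosis near 3 would indicate per-trade improvements that are roughly symmetric without heavy tails.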

31.4 Summary and Discussion

It is well known that a main inadequacy of economic theory is that it postulates exact functional relationships between variables. In empirical financial analysis, data points rarely lie exactly on straight lines or smooth functions. Ormerod [15] suggests that attempting to accommodate these nonlinear phenomena will introduce an unacceptable level of instability in models. As a result of this intractability, researchers and investors are turning to artificial intelligence techniques to better inform their models, creating decision support systems that can help a human user better understand complex financial systems such as stock markets. Artificial intelligence systems in portfolio selection have been shown to have a performance edge over the human portfolio manager, and recent research suggests that approaches incorporating artificial intelligence techniques are also likely to outperform classical financial models [16].

This study has introduced a system that utilises fuzzy logic in order to identify the current market condition that is produced by the accumulation of momentum. FULMAS is a fuzzy logic momentum analysis system that outperforms the traditional systems used in industry, which are often based on executing orders dependent on the weighted average of the current volume. Results of the implemented system have been displayed and compared against the traditional system. The results show that, on average, the system increases profitability on orders on both the buy and sell sides. FULMAS has been improved further by using ANFIS as an optimisation tool, and the new results have shown a significant improvement over both the original FULMAS system and the SVS system.

Acknowledgments The authors would like to thank Mr. Phil Hodey, the head of portfolio management and electronic trading at ICAP plc, for providing the tick data used in the simulations of the system and for his invaluable support and guidance.

References

1. Ellul A, Holden CW, Jain P, Jennings RH (2007) Order dynamics: recent evidence from the NYSE. J Empirical Finance 14(5):636–661
2. Chu HH, Chen TL, Cheng CH, Huang CC (2009) Fuzzy dual-factor time-series for stock index forecasting. Expert Syst Appl 36(1):165–171
3. Dourra H, Siy P (2002) Investment using technical analysis and fuzzy logic. Fuzzy Sets Syst 127(2):221–240
4. Mamdani E, Assilian S (1975) An experiment in linguistic synthesis with a fuzzy logic controller. Int J Man Mach Stud 7(1):1–13
5. Kablan A, Ng WL (2010) High frequency trading using fuzzy momentum analysis. In: Lecture notes in engineering and computer science: proceedings of the world congress on engineering 2010, WCE 2010, vol I, 30 June–2 July, London, UK, pp 352–357
6. Jang JR (1993) ANFIS: adaptive network-based fuzzy inference system. IEEE Trans Syst Man Cybern 23(3):665–685
7. Dimitrov V, Korotkich V (2002) Fuzzy logic: a framework for the new millennium, studies in fuzziness and soft computing, vol 81. Springer, New York
8. Takagi T, Sugeno M (1985) Fuzzy identification of systems and its application to modeling and control. IEEE Trans Syst Man Cybern 15(1):116–132
9. Jang JR, Sun CT, Mizutani E (1997) Neuro-fuzzy and soft computing. Prentice Hall, Upper Saddle River
10. Atsalakis GS, Valavanis KP (2009) Forecasting stock market short-term trends using a neuro-fuzzy based methodology. Expert Syst Appl 36(7):10696–10707
11. Abonyi J, Babuska R, Szeifert F (2001) Fuzzy modeling with multivariate membership functions: gray box identification and control design. IEEE Trans Syst Man Cybern B 31(5):755–767
12. Griffin J (2007) Do investors trade more when stocks have performed well? Evidence from 46 countries. Rev Financ Stud 20(3):905–951
13. Goldstein MA, Irvine P, Kandel E, Wiener Z (2009) Brokerage commissions and institutional trading patterns. Rev Financ Stud 22(12):5175–5212
14. Wong FS, Wang PZ (1990) A stock selection strategy using fuzzy neural networks. Neurocomputing 2(5):233–242
15. Ormerod P (2000) Butterfly economics: a new general theory of social and economic behaviour. Pantheon, New York
16. Brabazon A, O’Neill M, Maringer D (2010) Natural computing in computational finance, vol 3. Springer, Berlin

Chapter 32

The Determination of a Dynamic Cut-Off Grade for the Mining Industry

P. V. Johnson, G. W. Evatt, P. W. Duck and S. D. Howell

Abstract Prior to extraction from a mine, a pit is usually divided up into 3-D ‘blocks’ which contain varying levels of estimated ore-grades. From these, the order (or ‘pathway’) of extraction is decided, and this order of extraction can remain unchanged for several years. However, because commodity prices are uncertain, once each block is extracted from the mine, the company must decide in real-time whether the ore grade is high enough to warrant processing the block further in readiness for sale, or simply to waste the block. This paper first shows how the optimal cut-off ore grade—the level below which a block should be wasted—is not simply a function of the current commodity price and the ore grade, but also a function of the ore-grades of subsequent blocks, the costs of processing, and the bounds on the rates of processing and extraction. Secondly, the paper applies a stochastic price uncertainty, and shows how to derive an efficient mathematical algorithm to calculate and operate a dynamic optimal cut-off grade criterion throughout the extraction process, allowing the mine operator to respond to future market movements. The model is applied to a real mine composed of some 60,000 blocks, and shows that an extra 10% of value can be created by implementing such an optimal regime.

P. V. Johnson (✉) · G. W. Evatt · P. W. Duck
School of Mathematics, University of Manchester, Manchester, UK
e-mail: [email protected]
G. W. Evatt
e-mail: [email protected]
P. W. Duck
e-mail: [email protected]
S. D. Howell
Manchester Business School, University of Manchester, Manchester, UK
e-mail: [email protected]

S. I. Ao and L. Gelman (eds.), Electrical Engineering and Applied Computing, Lecture Notes in Electrical Engineering, 90, DOI: 10.1007/978-94-007-1192-1_32, © Springer Science+Business Media B.V. 2011


32.1 Introduction

Mineral mining is a complex engineering operation, which can last for several decades. As such, significant consideration must be given to the planning and design of the operation, so that numerous engineering constraints can be met, whilst making sure the operation is economically viable. To compound the difficulty of the task, the planning and scheduling of extraction from a mine is made in the presence of uncertainties, such as the future commodity price and estimated ore-grade. These uncertainties can fluctuate on a daily basis, highlighting the different timescales upon which the mining company must base decisions: the shorter time scales governed by commodity price and realised ore-grade, and the longer time-scales governed by (amongst other things) extraction rates and processing capacities. The focus of this paper is upon one of these short time-scale decisions: whether to process the extracted material, or to waste it. The level of ore-grade which separates this decision is known as the ‘cut-off grade’ [7].

Prior to extraction, the planning of the extraction schedule begins with deciding an appropriate pathway (or order) through the mine. Whilst it is possible to alter the order of extraction at various points during extraction, it is generally not a particularly flexible decision, as changing an order can require moving extraction machinery, processing units, the cancellation of contracts and large overhead costs. As such, it is reasonable to assume that the pathway through the mine is fixed, but it is how one progresses, and operates, along that pathway that is variable. At this planning stage, the mine is graphically divided up into 3-D blocks, each containing its own estimated quantity of ore. The estimated ore-grade carries with it an associated uncertainty, which can have an effect upon the valuation of a mining operation [6].
However, it is the expected (estimated) ore-grade level which dominates the planning of the actual pathway through the mine, as this is the best guess in deciding the order in which the resource should be extracted. The extraction pathway is most commonly decided using software such as the Gemcom-Whittle package [15], which allows companies to construct feasible pit shapes that satisfy slope constraints on the angle of the pit, transportation needs and work-force limitations. As previously mentioned, this algorithm may be used several times throughout a mine’s life, so as to ensure the mine plan is consistent with market conditions; on a day-to-day basis, however, the mine must take more detailed scheduling decisions in real time.

The key real-time decision is whether or not to process the latest extracted block (e.g. by milling or electrolysis) in readiness for sale, where the block’s intrinsic value varies with its ore grade and with the underlying commodity price. We define a ‘cost-effective’ block as one whose ore grade is high enough to pay the cash costs of processing, at the current price. However, the cut-off ore grade, above which a block should be processed, need not be set as low as the grade above which the block will be cost-effective to process. Disparity between the rate of extraction and the maximum processing capacity means that there can be an opportunity cost to processing all cost-effective material, since the small short-term gain of processing a low grade block could be surpassed by bringing forward the processing of more valuable blocks instead. The optimal wasting of potentially cost-effective material is the focus of this paper.

To highlight the above point, let us consider a trivial case where the mine has a stock of 3 blocks awaiting processing, extracted in order, $A$, $B$ and $C$, whose current market values after processing costs are $V_A = \$1$, $V_B = \$50$ and $V_C = \$1{,}000$. Whilst, classically, analysis has often been indifferent to the order of processing, with enough discounting applied one can see that by an optimal cut-off criterion, it would be best to simply waste $A$ and get on with processing $B$ and $C$. This is because the value gained in processing $A$ is less than the time value of money lost in waiting to process $B$ and $C$ at a later date. This lack of consideration of the discount rate has been highlighted before as a drawback in current mine planning [14] but, as yet, little progress has been made with it.

Another consequence of an optimal cut-off grade decision is having to increase the rate of extraction of poor quality ores to keep the processing plant loaded. This is because a processing unit will typically operate at a fixed capacity, and closing (or restarting) it is a costly and undesirable operation. As such, a maximum (and minimum) possible extraction rate must be known. This clearly illustrates the link between extraction rate and the optimal cut-off grade. With this maximum possible extraction rate, one knows precisely which blocks can possibly be extracted within each period in time, and thus the decision as to which block to process next can be decided.

There have been several other approaches to mine valuation and the corresponding extraction regime. Typically these have relied upon simulation methods to capture the uncertainty of price and ore-grade [8, 9, 12].
These types of method can be extremely time consuming, with computing times of several hours [3], and can often lead to sub-optimal and incomplete results. Using these simulation techniques, optimal cut-off grades were investigated by Menabde et al. [10], although little insight into the core dynamics, performance or robustness was obtained. A similar approach is the use of genetic algorithms (a general technique commonly used by computer scientists), which are capable of calculating mine schedules whilst adhering to specified constraints upon their design [11]. Whilst the work of Myburgh and Deb [11] was suitable for calculating feasible paths, the criterion by which this particular study operated was, again, not given, and the computing time was also of the order of hours. To make a step-change away from these methods, partial differential equations (PDEs) can be implemented to capture the full mine optimisation process, building on work by Brennan and Schwartz [2] and Chen and Forsyth [4]. The inclusion of stochastic ore-grade uncertainty via PDEs has also been tested by Evatt et al. [6], which enabled mine valuations to be produced in under 10 s and showed that the effect on mine value of stochastic ore-grade variation is much less than the effect of stochastic price. Whilst the mathematics and numerics of this PDE approach are relatively complex at the outset, once solved, they produce highly accurate results in short times, complete with model input sensitivities. This paper extends the use of PDEs, adding a model for tactical processing


decisions under foreseeable variations in ore grade and unforeseeable fluctuations in price. This shows that when processing capacity is constrained, the ability to maximise the value of processing by varying the cut-off ore grade can add significantly to mine value when optimally applied. By solving rapidly under a range of processing constraints, the scale of the processing plant can itself be optimised. In Sect. 32.2 we demonstrate the underlying concepts determining the optimal cut-off decision rule, and in Sect. 32.3 we apply a price uncertainty to the model and use a contingent claims approach to derive the governing equation. We then apply the model to a mine composed of some 60,000 blocks in Sect. 32.4, to show how much extra value the running of an optimal cut-off grade regime can add to a valuation. We draw together our concluding remarks in Sect. 32.5.

32.2 Cut-Off Grade Optimisation

The selection of the cut-off grade criterion reduces to whether a cost-effective block should be processed or not. This is because there is the possibility that a more valuable block could be brought forward in time to be processed, which otherwise would lose more time-value of money than the value gained from processing the first block. To highlight this point let us consider the order of extraction of blocks from a mine, which we (hypothetically) place in a chronologically ordered row. As we operate the processing unit of the mine, we must pass along this row and decide which blocks to process and which blocks to waste. In reality, although we know the (estimated) ore-grades of the blocks in advance, until we know for certain the market price at the time of processing we cannot know what cashflow each block will generate. Yet even if we assume a constant price, we can still show that dynamic cut-off grade decision making is required and optimal. Consider a highly simplified mine, as shown in Fig. 32.1, which is composed of just two blocks, Block1 and Block2, with ore grades \(G_1\) and \(G_2\), respectively. We allow the mine to have the capacity within the rate of extraction to immediately process either the first block, Block1, or its successor, Block2. As such, the comparison is between the value of processing both blocks in order, given by \(V_{12}\), or the value of only processing Block2, \(V_2\). With a constant price, \(S\), we can write down the net present value of these two (already extracted) blocks, where we shall process both,

\[ V_{12} = (S G_1 - P) + (S G_2 - P)\,e^{-r\,\delta t}. \tag{32.1} \]

Here \(\delta t\) is the time it takes to process each block, \(P\) is the cost of processing each block and the discount rate is \(r\). This value must be compared to the decision to waste the first block and process only the second block, which would have value

\[ V_2 = (S G_2 - P). \tag{32.2} \]

Fig. 32.1 Two examples of how price may affect the order in which blocks are processed so as to maximise a mine's NPV. Example A is made with a low commodity price, \(S = \$1{,}000\ \text{kg}^{-1}\), and Example B is made with a high commodity price, \(S = \$10{,}000\ \text{kg}^{-1}\). [Figure: Block1 (10 kg) and Block2 (1000 kg), shown in the direction of extraction, with potential block values. Example A (S = $1,000 per kg): processing both blocks gives $9,900 and $989,950, NPV = $999,850; wasting Block1 gives $999,900, NPV = $999,900. Example B (S = $10,000 per kg): processing both blocks gives $99,990 and $9,900,040, NPV = $10,000,030; wasting Block1 gives $9,999,900, NPV = $9,999,900.]

This comparison between \(V_{12}\) and \(V_2\) is one the algorithm must continually make. To demonstrate how the selection depends upon the underlying price, Fig. 32.1 shows the choices available for two different commodity prices, one high (\(S = \$10{,}000\ \text{kg}^{-1}\)) and one low (\(S = \$1{,}000\ \text{kg}^{-1}\)). These are made with prescribed parameter values

\[ r = 10\%, \qquad P = \$100\ \text{block}^{-1}, \qquad \delta t = 0.1\ \text{year}. \tag{32.3} \]

As can be seen, in the low-price case, Example A, it is best to process only the second block. However, in the high commodity price case, namely Example B, it is best to process both blocks. This simple example demonstrates (albeit with rather exaggerated parameter values) how the selection needs to be actively taken, and how different values of the underlying price, and discount rate, will affect the optimal cut-off decision. Another consequence of this optimal decision taking is that the mine will be exhausted earlier than might have been previously thought, since we wasted the first block and only processed the second, hence a mine owner could agree a shorter lease on this particular mine.
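The two-block comparison can be checked numerically. The following is a minimal sketch, assuming the block grades are 10 kg and 1000 kg of commodity per block (as read from Fig. 32.1) and using the parameter values of (32.3); the function names are illustrative and not part of the original paper.

```python
import math

def npv_process_both(S, G1, G2, P, r, dt):
    """Eq. (32.1): process Block1 now and Block2 one processing period later."""
    return (S * G1 - P) + (S * G2 - P) * math.exp(-r * dt)

def npv_waste_first(S, G2, P):
    """Eq. (32.2): waste Block1 and process Block2 immediately."""
    return S * G2 - P

def best_choice(S, G1, G2, P=100.0, r=0.10, dt=0.1):
    """Return the better decision together with both candidate values."""
    v12 = npv_process_both(S, G1, G2, P, r, dt)
    v2 = npv_waste_first(S, G2, P)
    decision = "process both" if v12 >= v2 else "waste Block1"
    return decision, v12, v2

# Example A (low price) and Example B (high price) of Fig. 32.1.
for label, price in (("A", 1_000.0), ("B", 10_000.0)):
    decision, v12, v2 = best_choice(price, G1=10.0, G2=1_000.0)
    print(f"Example {label}: V12 = {v12:,.0f}, V2 = {v2:,.0f} -> {decision}")
```

Under these assumptions the sketch selects wasting Block1 at the low price and processing both blocks at the high price, in line with the discussion above.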

32.3 Model Construction

To create the framework for determining an optimal dynamic cut-off grade, we can make use of two distinct methods for arriving at the core equation describing the valuation, \(V\). The first method follows a contingent claims approach, in which the


uncertainty arising from the underlying price is removed by hedging away the risk via short-selling suitable quantities of the underlying resource. The second method follows the Feynman–Kac probabilistic method, as described in relation to the mining industry by Evatt et al. [5], which is the chosen method for deriving a valuation when hedging is not undertaken. This second method is also permissible when hedging does take place, but a slight adjustment to the price process is required, as explained in that paper. Because Evatt et al. [5] already cover the derivation of the mine valuation, in the present paper we explain how the contingent claims approach can be used. We first prescribe three state-space variables: the price per unit of the underlying resource in the ore, \(S\); the remaining amount of ore within the mine, \(Q\); and time, \(t\). We next need to define the underlying price uncertainty process, which we assume to follow a geometric Brownian motion,

\[ dS = \mu S\,dt + \sigma_s S\,dX_s, \tag{32.4} \]

where \(\mu\) is the drift, \(\sigma_s\) the volatility of \(S\), and \(dX_s\) is the increment of a standard Wiener process. We use this price process without loss of generality, since other price processes (such as mean-reverting Brownian motion) can easily be implemented by the techniques described here. Using the contingent claims approach (see [16]) and the above notation, we may apply Itô’s lemma to write an incremental diffusive change in \(V\) as

\[ dV = \sigma_s S \frac{\partial V}{\partial S}\,dX_s + \frac{\partial V}{\partial Q}\,dQ + \left( \frac{\partial V}{\partial t} + \frac{1}{2}\sigma_s^2 S^2 \frac{\partial^2 V}{\partial S^2} + \mu S \frac{\partial V}{\partial S} \right) dt, \tag{32.5} \]

where we have taken terms of order \((dt)^2\) and \((dQ)^2\) to be negligible. We are able to remove the \(dQ\) term via the relationship between \(Q\) and \(t\) by specifying the rate of extraction, \(q_e\), namely,

\[ dQ = -q_e\,dt, \tag{32.6} \]

where \(q_e\) can be a function of all three variables, if required. This extraction rate is the function we wish to determine in our optimal cut-off regime, as it governs both how we progress through the mine and, as a consequence, which blocks we choose to waste. The rate of extraction will obviously have limitations on its operating capacity, \(q_e \in [0, q_{\max}]\), where \(q_{\max}\) could itself be a function of time. The rate of extraction is closely linked to the rate of processing, which should be kept at a fixed constant, \(q_p\). Hence \(q_{\max}\) must be big enough for the processing unit to always operate at its constant capacity, \(q_p\); i.e. there must always be enough cost-effective ore-bearing material being extracted from the mine to meet the processing capacity. Optimal variation in the extraction rate has already been shown to produce improved valuations [7], although this was achieved without considering processing limitations or grade variation.
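To give a feel for the price process (32.4) that drives the valuation, the following minimal sketch samples one path of a geometric Brownian motion using the exact log-normal update over each time step. The function name, the parameter names and the numerical values are illustrative assumptions, not taken from the paper.

```python
import math
import random

def simulate_gbm(s0, mu, sigma, horizon, n_steps, rng=None):
    """Sample one path of dS = mu*S*dt + sigma*S*dX using the exact
    log-normal update over each step of length dt = horizon / n_steps."""
    rng = rng or random.Random()
    dt = horizon / n_steps
    path = [s0]
    for _ in range(n_steps):
        z = rng.gauss(0.0, 1.0)  # standard normal draw for the Wiener increment
        growth = (mu - 0.5 * sigma ** 2) * dt + sigma * math.sqrt(dt) * z
        path.append(path[-1] * math.exp(growth))
    return path

# Illustrative values only (not the chapter's data set).
path = simulate_gbm(s0=100.0, mu=0.05, sigma=0.30, horizon=1.0,
                    n_steps=250, rng=random.Random(42))
print(f"final simulated price: {path[-1]:.2f}")
```

With the volatility set to zero the path reduces to deterministic growth at rate mu, which is a convenient sanity check on the update.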

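With this relationship, (32.6), Eq. 32.5 can be transformed into

\[ dV = \sigma_s S \frac{\partial V}{\partial S}\,dX_s + \left( \frac{\partial V}{\partial t} - q_e \frac{\partial V}{\partial Q} + \frac{1}{2}\sigma_s^2 S^2 \frac{\partial^2 V}{\partial S^2} + \mu S \frac{\partial V}{\partial S} \right) dt. \tag{32.7} \]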

To follow the conventional approach in creating and valuing risk-free portfolios we construct a portfolio, \(\Pi\), in which we are instantaneously long in (owning) the mine and short in (owing) \(\gamma_s\) amounts of commodity contracts. This defines \(\Pi = V - \gamma_s S\), such that

\[ d\Pi = dV - \gamma_s\,dS. \tag{32.8} \]

This portfolio is designed to contain enough freedom in \(\gamma_s\) to be able to continually hedge away the uncertainty of \(dX_s\), which is the standard approach in creating risk-free portfolios [1, 13]. It also implies that within a small time increment, \(dt\), the value of \(\Pi\) will increase by the risk-free rate of interest, minus any economic value generated and paid out by the mine during the increment. This economic value is typically composed of two parts, the first, negative, being the cost to extract, \(q_e M\), and the second, positive, the cash generated by selling the resource content of the ore processed, \(q_p(SG - P)\). Here \(M\) is the cost of extraction per ore tonne, \(P\) is the processing cost per ore tonne, and \(G\) is the ore grade (weight of commodity per ore tonne). The reason why the economic functions contain the factors \(q_e\) or \(q_p\) is that we wish to maximise value by varying \(q_e\) in real time, so as to maintain \(q_p\) at its fixed bound. In turning the discrete block model into a continuous function describing the ore grade, \(G\), we have assumed that blocks are small enough that they can be approximated as infinitesimal increments of volume. As discussed in Sect. 32.2, the decision whether to process or waste the next block must be optimised. Before or after optimisation the incremental change in portfolio value may be written as

\[ d\Pi = r\Pi\,dt + \gamma_s \delta S\,dt - q_p(GS - P)\,dt + q_e M\,dt. \tag{32.9} \]

By setting the appropriate value of \(\gamma_s\) to be

\[ \gamma_s = \frac{\partial V}{\partial S}, \]

and substituting Eqs. (32.4), (32.7) and (32.8) into (32.9), we may write our mine valuation equation as

\[ \frac{1}{2}\sigma_s^2 S^2 \frac{\partial^2 V}{\partial S^2} + \frac{\partial V}{\partial t} - q_e \frac{\partial V}{\partial Q} + (r - \delta) S \frac{\partial V}{\partial S} - rV + q_p(GS - P) - q_e M = 0. \tag{32.10} \]

This is of the same form as that derived by Brennan and Schwartz [2], except that they added taxation terms, but did not model processing constraints or variations of ore grade.
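The substitution leading to the valuation equation can be spelled out in two short steps. This is a sketch under the assumption that the portfolio balance (32.9) reads \(d\Pi = r\Pi\,dt + \gamma_s \delta S\,dt - q_p(GS - P)\,dt + q_e M\,dt\), with \(\Pi\) the hedged portfolio, \(\gamma_s = \partial V/\partial S\) the hedge ratio, and the price following (32.4).

```latex
% Step 1: hedge. Insert (32.7) and (32.4) into (32.8) with
% \gamma_s = \partial V / \partial S; the dX_s and \mu terms cancel:
\[
d\Pi = dV - \frac{\partial V}{\partial S}\,dS
     = \left( \frac{\partial V}{\partial t}
       - q_e \frac{\partial V}{\partial Q}
       + \tfrac{1}{2}\sigma_s^2 S^2 \frac{\partial^2 V}{\partial S^2} \right) dt .
\]
% Step 2: equate with (32.9), using \Pi = V - S\,\partial V/\partial S:
\[
\frac{\partial V}{\partial t} - q_e \frac{\partial V}{\partial Q}
  + \tfrac{1}{2}\sigma_s^2 S^2 \frac{\partial^2 V}{\partial S^2}
  = r\left( V - S \frac{\partial V}{\partial S} \right)
  + \delta S \frac{\partial V}{\partial S} - q_p (GS - P) + q_e M .
\]
% Step 3: move every term to the left-hand side to recover (32.10),
% in which the drift \mu has been replaced by r - \delta.
```

The drift \(\mu\) drops out of the final equation, which is the usual outcome of a hedging argument of this kind.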


We next need to prescribe boundary conditions for (32.10). The boundary condition that no more profit is possible occurs either when the reserve is exhausted, \(Q = 0\), or when a lease to operate the mine has reached its expiry date, \(t = T\); hence

\[ V = 0 \quad \text{on} \quad Q = 0 \quad \text{and/or} \quad t = T. \tag{32.11} \]

Since the extraction rate will have a physical upper bound, the extraction rate and cost will not vary with \(S\) when \(S\) is large. This permits a far-field condition of the form

\[ \frac{\partial V}{\partial S} \to A(Q, t) \quad \text{as} \quad S \to \infty. \tag{32.12} \]

When the underlying resource price is zero we need only solve the reduced form of Eq. 32.10 with \(S = 0\), which reduces to

\[ V = -e^{rt} \int_t^T q_e M(z)\, e^{-rz}\, dz. \tag{32.13} \]

This completes the determination of our core equation and its boundary conditions. We can now define the optimising problem which we wish to solve: we must determine the optimal extraction rate, \(q_e^*\), at every point in the state space which maximises the value \(V\), where \(V\) satisfies Eq. 32.10 with \(q_e = q_e^*\), subject to the defined boundary conditions. Problems of this type may be solved numerically using finite-difference methods, in particular the semi-Lagrangian numerical technique (see [4] for further details). All results in this paper have been thoroughly tested for numerical convergence and stability. We must now show how the optimal \(q_e^*\) and its corresponding cut-off grade are to be incorporated into the maximisation procedure.

32.4 Example Valuation

We now apply our optimal cut-off grade model to a real mine of some 60,000 blocks, whose block-by-block ore-grade and sequence of extraction were supplied by Gemcom Software International. This mine has an initial capital expenditure of some $250m. We were also supplied with a fixed reference price, \(S_{\text{ref}}\), against which to compare valuations. We ourselves assumed a maximum extraction rate of five times the processing rate, which is broadly realistic, and it restricts the mine to wasting no more than 80% of any section of cost-effective ore (if one can increase the extraction rate fivefold, then it is possible to waste four blocks and process the fifth). The other parameter values we were supplied are

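\[ r = 10\%\ \text{year}^{-1}, \quad \delta = 10\%\ \text{year}^{-1}, \quad S_{\text{ref}} = \$11{,}800\ \text{kg}^{-1}, \quad \sigma_s = 30\%\ \text{year}^{-1/2}, \]
\[ P = \$4\ \text{tonne}^{-1}, \quad Q_{\max} = 305{,}000{,}000\ \text{tonnes}, \quad \epsilon = \$1\ \text{tonne}^{-1}, \quad q_p = 20{,}000{,}000\ \text{tonnes year}^{-1}. \tag{32.14} \]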
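Two quantities implied by these parameters are easy to check by hand: the largest fraction of extracted ore that can be wasted while the plant stays full, and the time to exhaust the mine. The sketch below assumes, purely for illustration, that a constant fraction of ore is wasted throughout (the optimal regime in the paper varies this with price, reserve and time); the function and constant names are hypothetical.

```python
Q_MAX = 305_000_000.0   # tonnes of ore initially in the mine, from (32.14)
Q_P = 20_000_000.0      # processing capacity in tonnes per year, from (32.14)
RATE_MULTIPLE = 5.0     # assumed ratio q_max / q_p, as stated in the text

def max_waste_fraction(rate_multiple):
    """Largest share of extracted ore that can be wasted while the plant
    still receives q_p tonnes per year: 1 - q_p / q_max."""
    return 1.0 - 1.0 / rate_multiple

def years_to_exhaustion(q_total, q_p, waste_fraction):
    """Time to extract q_total tonnes when a constant fraction of extracted
    ore is wasted and the plant is kept full, i.e. q_e = q_p / (1 - w)."""
    q_e = q_p / (1.0 - waste_fraction)
    return q_total / q_e

print(f"maximum waste fraction: {max_waste_fraction(RATE_MULTIPLE):.2f}")
print(f"mine life with no wasting: {years_to_exhaustion(Q_MAX, Q_P, 0.0):.2f} years")
```

With no wasting the mine life is 305/20 = 15.25 years, consistent with the exhaustion time of about 15 years quoted in the results; any wasting shortens it.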

Whilst the ore-grade is quite volatile, it was shown in Evatt et al. [6] that a suitable average of the estimated grade quality could be used without any sizeable alteration in the valuation, as one would expect, since the same volume of ore is available to be sold whether one takes average values or not. Using this average, Fig. 32.2 shows the economic worth, \(S_{\text{ref}} G - P\), throughout extraction for each part of the mine, where we have assumed the price to remain at its prescribed reference price. This highlights how the grade varies through the extraction process, and it is with reference to this grade variation that we shall compare the regions where it is optimal to speed up extraction and consequently waste certain parts of the ore body.

Fig. 32.2 Given a block ordering in the mine, the average standardised grade value is the cash value of ore (against reference price) minus processing costs per tonne of ore. These data were supplied by Gemcom Software International. [Plot: average standardised grade (vertical axis, 0 to 12) against % of ore tonnes remaining (horizontal axis, 0 to 100).]

32.4.1 Results

For the example mine, we first calculate and compare two different valuations made with and without the optimal cut-off criterion. Figure 32.3 shows two sets of valuations: the lower pair (straight lines; one dashed, one solid) shows the valuations made assuming a constant price (\(\sigma_s = 0\%\)), and the upper pair (curved lines; one dashed, one solid) shows the effect of including both price uncertainty (\(\sigma_s = 30\%\)) and the option to abandon the mine when the valuation becomes negative, which is a standard option to include in a reserve valuation [2]. In each pair of lines the lower, dashed line shows the valuation without a cut-off regime, and the higher, solid line shows the valuation with the optimal cut-off regime. The optimal cut-off regime increases the mine valuation by up to 10%, with increasing benefit

at higher prices. This may seem surprising, but although the mine is always more profitable at higher prices, the opportunity cost of not allocating the finite processing capacity to the best available block does itself grow. An obvious question which arises from this analysis is how we decide which ore-grades to waste, and what the corresponding rate of extraction is to achieve this. Given that the mine operator will know at each point in time what the current underlying price is, they can look at the corresponding slice through the 3-D surface of the optimal cut-off grade, and see for which regions in \(t\) and \(Q\) they would waste ore and increase the rate of extraction. With this we can refer back to the corresponding grade of Fig. 32.2 and easily calculate what these grades actually are. For example, by looking at the closed regions of Fig. 32.4 we can see the optimal cut-off grades for two different commodity prices, \(S = 100\%\) (top) and \(S = 200\%\) (bottom) of the reference price. The points at which it is optimal to increase the rate of extraction are given by the segments where the closed regions (bounded by the thin line) intersect the optimal extraction trajectory (bold line). In the two examples of Fig. 32.4, both appear to correspond to a standardised cut-off grade (Fig. 32.2) of around 2 units. The optimal rate of extraction is given by the gradient of the bold line, where the trajectory is calculated by integrating (32.6) for a given extraction regime. The difference between the dashed line (trajectory for the no cut-off situation) and the thick solid line of the optimal cut-off regime therefore gives an indication of the total amount of ore wasted. Finally, Fig. 32.5 shows how the NPV depends upon the expected expiry time for extraction if one operates an optimal cut-off regime (solid line) or not (dashed line). If the mine chooses the optimal regime, the maximum NPV occurs just after 14 years, rather than at mine exhaustion at 15 years (as it does with no cut-off). This is a consequence of an optimal cut-off grade regime, in which the mine will occasionally increase its extraction rate from the (originally) planned level due to market fluctuations, thereby reaching the final pit shape in a shorter time.

Fig. 32.3 NPV of the mine against percentage of reference price for two different sets of valuations. The two lower lines (straight lines; one dashed, one solid) are for a constant price while the two upper lines (curved lines; one dashed, one solid) include price volatility and the abandonment option. NPV for the optimal cut-off regime is shown by solid lines, and no cut-off by dashed lines. [Plot: NPV [$100m] against % of base commodity price (50 to 150).]

Fig. 32.4 Graphs showing the optimal cut-off regions for an extraction project for two different price levels, medium (top), and high (bottom). The closed regions contained within the thin solid lines show where ore is wasted and the extraction rate is increased. The dashed line represents one realisation of a trajectory followed with no cut-off, while the thick solid line represents the realisation of the trajectory followed with optimal cut-off. [Two panels: time remaining, T, against % of ore tonnes remaining.]

Fig. 32.5 The NPV of the mine against time remaining on the option on the mine given that 100% of the mine is present. The solid line is with optimal cut-off, dashed without. [Plot: NPV [$100m] against expiry date, T.]

32.5 Conclusions

This paper has shown how to solve and optimise a (relatively) short time-scale mining problem, known as a dynamic cut-off grade, which is the continuous decision of whether to process extracted ore or not. This was achieved in the presence of price uncertainty. We have described how the partial differential equation model can be derived via two distinct methods, either by a contingent claims approach, when continuous hedging is present, or by the Feynman–Kac method. Using this model, we have shown how to determine and operate an optimal dynamic cut-off grade regime. As such, we have valued the ‘option’ to process or not to process under uncertainty, allowing the mine owner to react to future market conditions. With our given example, the option adds around 10% to the expected NPV of an actual mine of 60,000 blocks. One natural extension of this work will be to allow the cut-off grade to remain fixed for discrete periods of time, so that mine operators do not have to continually alter their rate of extraction due to market changes.

References

1. Black F (1976) The pricing of commodity contracts. J Financial Econ 3:167–179
2. Brennan MJ, Schwartz ES (1985) Evaluating natural resource investments. J Business 58(2):135–157
3. Caccetta L, Hill SP (2003) An application of branch and cut to open pit mine scheduling. J Global Optim 27:349–365
4. Chen Z, Forsyth PA (2007) A semi-Lagrangian approach for natural gas storage valuation and optimal operation. SIAM J Sci Comput 30(1):339–368
5. Evatt GW, Johnson PV, Duck PW, Howell SD, Moriarty J (2010) The expected lifetime of an extraction project. In: Proceedings of the Royal Society A, Firstcite. doi:10.1098/rspa.2010.0247
6. Evatt GW, Johnson PV, Duck PW, Howell SD (2010) Mine valuations in the presence of a stochastic ore-grade. In: Lecture notes in engineering and computer science: proceedings of the World Congress on engineering 2010, vol III, WCE 2010, 30 June–2 July, 2010, London, UK, pp 1811–1866
7. Johnson PV, Evatt GW, Duck PW, Howell SD (2010) The derivation and impact of an optimal cut-off grade regime upon mine valuation. In: Lecture notes in engineering and computer science: proceedings of the World Congress on engineering 2010, WCE 2010, 30 June–2 July, 2010, London, UK, pp 358–364
8. Jewbali A, Dimitrakopoulos R (2009) Stochastic mine planning - example and value from integrating long- and short-term mine planning through simulated grade control. Orebody modelling and strategic mine planning, 2nd edn. The Australasian Institute of Mining and Metallurgy, Melbourne, pp 327–333
9. Martinez LA (2009) Designing, planning and evaluating a gold mine project under in-situ metal grade and metal price uncertainties. Orebody modelling and strategic mine planning, 2nd edn. The Australasian Institute of Mining and Metallurgy, Melbourne, pp 225–234
10. Menabde M, Foyland G, Stone P, Yeates GA (2004) Mining schedule optimisation for conditionally simulated orebodies. In: Proceedings of the international symposium on orebody modelling and strategic mine planning: uncertainty and risk management, pp 347–52
11. Myburgh C, Deb K (2010) Evolutionary algorithms in large-scale open pit mine scheduling. In: Proceedings of the 12th annual conference on genetic and evolutionary computation, pp 1155–1162
12. Ramazan S, Dimitrakopoulos R (2007) Stochastic optimisation of long-term production scheduling for open pit mines with a new integer programming formulation. Orebody modelling and strategic mine planning. The Australasian Institute of Mining and Metallurgy, Melbourne, pp 385–391
13. Schwartz ES (1997) The stochastic behavior of commodity prices: implications for valuation and hedging. J Finance LII(3):923–973
14. Tolwinski B, Underwood R (1996) A scheduling algorithm for open pit mines. IMA J Math Appl Bus Ind 7:247–270
15. Whittle D, Cahill J (2001) Who plans mines? In: Strategic mine planning conference, Perth, WA, pp 15–18
16. Wilmott P, Howison S, Dewynne J (1995) The mathematics of financial derivatives. Cambridge University Press, Cambridge

Chapter 33

Improved Prediction of Financial Market Cycles with Artificial Neural Network and Markov Regime Switching

David Liu and Lei Zhang

Abstract This paper analyses the forecasting of Shanghai Stock Exchange Composite Index movements for the period 1999–2009 using two competing non-linear models: a univariate Markov Regime Switching model and an Artificial Neural Network model (RBF). The experiment shows that RBF is a useful method for forecasting the regime duration of the moving trends of the Stock Composite Index. The framework employed also proves useful for forecasting Stock Composite Index turning points. The empirical results in this paper show that the ANN method is preferable to the Markov-Switching model to some extent.

33.1 Introduction

Many studies conclude that stock returns can be predicted by means of macroeconomic variables with an important business cycle component. Due to the fact that the change in regime should be considered as a random event

D. Liu (&) L. Zhang Department of Mathematical Sciences, Xi’an Jiaotong Liverpool University, SIP, 215123, Suzhou, China e-mail: [email protected] L. Zhang e-mail: [email protected] L. Zhang University of Liverpool, Liverpool, UK

S. I. Ao and L. Gelman (eds.), Electrical Engineering and Applied Computing, Lecture Notes in Electrical Engineering, 90, DOI: 10.1007/978-94-007-1192-1_33, Ó Springer Science+Business Media B.V. 2011


and not predictable, it is natural to analyze the Shanghai Stock Exchange Composite Index within this context. There is much empirical support for the view that macroeconomic conditions affect aggregate equity prices; accordingly, macroeconomic factors could be used to explain security returns. In order to study the dynamics of the market cycles which evolved in the Shanghai Stock Exchange Market, the Composite Index is first modeled in regime switching within a univariate Markov-Switching framework (MRS). One key feature of the MRS model is that it estimates the probabilities of a specific state at a given time. Past research has developed the econometric methods for estimating parameters in regime-switching models, and demonstrated how regime-switching models can characterize the time series behavior of some variables better than the existing single-regime models. The concept of Markov switching regimes dates back to "Microeconomic Theory: A Mathematical Approach" [1]. Hamilton [2] applied this model to the study of the United States business cycles and regime shifts from positive to negative growth rates in real GNP. Hamilton [2] extended Markov regime-switching models to the case of autocorrelated dependent data. Hamilton and Lin also report that economic recessions are a main factor in explaining conditionally switching moments of stock market volatility [3, 4]. Similar evidence of regime switching in the volatility of stock returns has been found by Hamilton and Susmel [5], Edwards and Susmel [6], Coe [7] and [8]. Secondly, this paper deals with the application of a neural network method, a Radial Basis Function (RBF) network, to the prediction of the moving trends of the Shanghai Stock Exchange Composite Index. RBFs have been employed in time series prediction with success as they can be trained to find complex relationships in the data [9]. A large number of successful applications have shown that ANN models are a useful vehicle for forecasting financial variables and for time-series modeling and forecasting [10, 11]. In the early days, these studies focused on estimating the level of the return on a stock price index. Current studies reflect an interest in selecting the predictive factors from a variety of input variables to forecast stock returns by applying neural networks. Several techniques such as regression coefficients [12], autocorrelations [13], backward stepwise regression [14], and genetic algorithms [14] have been employed by researchers to perform variable subset selection [12, 13]. In addition, several researchers subjectively selected the subsets of variables based on empirical evaluations [14]. The paper is organized as follows. Section 33.2 gives the data description and preliminary statistics. Section 33.3 presents the research methodology. Section 33.4 presents and discusses the empirical results. The final section provides a summary and conclusion.

Table 33.1 Model summary

Model   R       R square   Adjusted R square   Std. error
1       0.653   0.427      0.422               768.26969
2       0.773   0.597      0.590               647.06456
3       0.834   0.695      0.688               564.83973
4       0.873   0.763      0.755               500.42457
5       0.894   0.800      0.791               461.69574

33.2 Data Description and Preliminary Statistics

33.2.1 Data Description

This paper applies two non-linear models, a univariate Markov Switching model and an Artificial Neural Network model, to the behavior of the Chinese Stock Exchange Composite Index using data for the period from 1999 to 2009. As the Shanghai Stock Exchange is the primary stock market in China and the Shanghai A Share Composite is the main index reflecting the Chinese stock market, this research adopts the Shanghai Composite (A Share). The data consist of daily observations of the Shanghai Stock Exchange Market general price index for the period 29 October 1999 to 31 August 2009, excluding all weekends and holidays, giving a total of 2369 observations. For both the MRS and the ANN models, the series are taken in natural logarithms.

33.2.2 Preliminary Statistics

In this part we explore the relationship between the Shanghai Composite and the Consumer Price Index, Retail Price Index, Corporate Goods Price Index, Social Retail Goods Index, Money Supply, Consumer Confidence Index and Stock Trading, using t-tests and regression analysis to pick out the most relevant variables as the influence factors in our research. By using regression analysis we test the hypothesis and identify correlations between the variables. In the following multiple regression analysis we test the following hypotheses and see whether they hold true:

\[ H_0: \beta_1 = \beta_2 = \beta_3 = \dots = \beta_K = 0 \quad \text{(regression insignificant)}, \]
\[ H_1: \text{at least some of the } \beta \text{ are not equal to } 0. \]

In Table 33.1, R-square (\(R^2\)) is the proportion of variance in the dependent variable (Shanghai Composite Index) which can be predicted from the independent variables. This value indicates that 80% of the variance in the Shanghai Composite Index can be predicted from the variables Consumer Price Index, Retail Price Index, Corporate Goods Price Index, Social Retail Goods Index, Money Supply, Consumer Confidence Index, and Stock Trading. It is worth pointing out that this is an overall measure of the strength of association, and does not reflect the extent to which any particular independent variable is associated with the dependent variable.

Table 33.2 ANOVA

Model   Source       Sum of squares   df    Mean square   F        Sig.
1       Regression   5.323E7          1     5.323E7       90.178   0.000
        Residual     7.142E7          121   590238.317
        Total        1.246E8          122
2       Regression   7.440E7          2     3.720E7       88.851   0.000
        Residual     5.024E7          120   418692.539
        Total        1.246E8          122
3       Regression   8.668E7          3     2.889E7       90.561   0.000
        Residual     3.797E7          119   319043.922
        Total        1.246E8          122
4       Regression   9.510E7          4     2.377E7       94.934   0.000
        Residual     2.955E7          118   250424.748
        Total        1.246E8          122
5       Regression   9.971E7          5     1.994E7       93.549   0.000
        Residual     2.494E7          117   213162.957
        Total        1.246E8          122

In Table 33.2, the p-value is compared to the alpha level (typically 0.05). The F-test is significant, with p-value = 0.000. This means that we reject the null hypothesis that Stock Trading, Consumer Price Index, Consumer Confidence Index, Corporate Goods Price Index and Money Supply have no effect on the Shanghai Composite. The p-value (Sig.) from the F-test in the ANOVA table is reported as 0.000, i.e. less than 0.001, implying that we reject the null hypothesis that the regression coefficients (the \(\beta\)'s) are all simultaneously zero. By looking at the Sig. column in particular, we gather that Stock Trading, Consumer Price Index, Consumer Confidence Index, Corporate Goods Price Index and Money Supply are variables with p-values less than 0.02 and hence highly significant. Turning to Fig. 33.1, the correlation numbers measure the strength and direction of the linear relationship between the dependent and independent variables. To show these correlations visually we use partial regression plots. The points tend to form along a line going from the bottom left to the upper right, which is the same as saying that the correlation is positive. We conclude that the correlation of Stock Trading, Consumer Price Index, Consumer Confidence Index, Corporate Goods Price Index and Money Supply with the Shanghai Composite Index is positive because the points tend to form along this line.
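The entries of Tables 33.1 and 33.2 are tied together by the standard identities for a least-squares regression with an intercept. The following minimal sketch recomputes R-square, adjusted R-square and the F statistic from the rounded sums of squares of Model 5; the function name is illustrative, and the sample size of 123 is inferred from the total degrees of freedom (122) in Table 33.2.

```python
def regression_summary(ss_regression, ss_residual, n_predictors, n_obs):
    """Return (R^2, adjusted R^2, F) from the ANOVA sums of squares of an
    ordinary least-squares fit with an intercept."""
    ss_total = ss_regression + ss_residual
    df_residual = n_obs - n_predictors - 1
    r2 = ss_regression / ss_total
    adj_r2 = 1.0 - (ss_residual / df_residual) / (ss_total / (n_obs - 1))
    f_stat = (ss_regression / n_predictors) / (ss_residual / df_residual)
    return r2, adj_r2, f_stat

# Model 5 of Table 33.2: rounded sums of squares, 5 predictors,
# 123 observations (total df = 122).
r2, adj_r2, f_stat = regression_summary(9.971e7, 2.494e7, 5, 123)
print(f"R^2 = {r2:.3f}, adjusted R^2 = {adj_r2:.3f}, F = {f_stat:.2f}")
```

Because the tabulated sums of squares are rounded to four significant figures, the recomputed values should agree with the tabulated 0.800, 0.791 and 93.549 only up to rounding.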

Fig. 33.1 Normal P–P plot regression standardized residual

Fig. 33.2 China CPI, CGPI and Shanghai A Share Composite Index

Because the CPI Index, the CGPI Index and the Money Supply Increased Ratio (M1 Increased Ratio - M2 Increased Ratio) are the factors most strongly correlated with the Share Composite, we choose them as the macroeconomic indicators, following Qi and Maddala [12]. These three indicators, together with a data set from the Shanghai Stock Exchange Market, are used in the experiments to test the forecasting accuracy of the RBF [12]. Figures 33.2 and 33.3 show the development of the Shanghai Composite index alongside CPI, CGPI and MS over time.

33.3 Empirical Models

In this section, the univariate Markov Switching Model developed by Hamilton [2] is adopted to explore regime switching in the Shanghai Stock Exchange Composite Index. We then develop an artificial neural network (ANN), specifically an RBF network, to predict the moving trends of the stock index. We use the RBF method to find the relationship of the CPI Index, the CGPI Index and the Money Supply Increased Ratio with the Stock Composite Index. The RBF network is designed with the efficient design function (newrb) of the Matlab Neural Network Toolbox. Finally, the forecasting performances of these two competing non-linear models are compared.

Fig. 33.3 China money supply increased (annual basis) and Shanghai A Share Composite Index

33.3.1 Markov Regime Switching Model and Estimation

33.3.1.1 Markov Regime Switching Model

The comparison of the in-sample forecasts is based on the Markov Switching/Hamilton filter notation, using Marcelo Perlin’s forecasting model (version updated 21 June 2009). A potentially useful approach to modelling nonlinearities in time series is to assume different behavior (a structural break) from one subsample (or regime) to another. If the dates of the regime switches are known, the modelling can be done with dummy variables. For example, consider the following regression model:

$$y_t = X_t' \beta_{S_t} + \varepsilon_t, \qquad t = 1, \ldots, T \tag{33.1}$$

where $\varepsilon_t \sim \mathrm{NID}(0, \sigma^2_{S_t})$, $\beta_{S_t} = \beta_0 (1 - S_t) + \beta_1 S_t$, $\sigma^2_{S_t} = \sigma_0^2 (1 - S_t) + \sigma_1^2 S_t$, and $S_t = 0$ or $1$ (Regime 0 or 1). Usually it is assumed that the possible difference between the regimes is a shift in mean and volatility, but no autoregressive change. That is:

$$y_t = \mu_{S_t} + \phi\,(y_{t-1} - \mu_{S_{t-1}}) + \varepsilon_t, \qquad \varepsilon_t \sim \mathrm{NID}(0, \sigma^2_{S_t}) \tag{33.2}$$

where $\mu_{S_t} = \mu_0 (1 - S_t) + \mu_1 S_t$. If $S_t$ ($t = 1, \ldots, T$) is known a priori, then the problem is just a usual dummy variable autoregression problem. In practice, however, the prevailing regime is not usually directly observable. Denote by $P(S_t = j \mid S_{t-1} = i) = p_{ij}$ ($i, j = 0, 1$) the transition probabilities, with $p_{i0} + p_{i1} = 1$ for $i = 0, 1$. This kind of process, where the current state depends only on the state before, is called a Markov process, and the model is a Markov switching model in the mean and the variance. The probabilities in a Markov process can be conveniently presented in matrix form:

$$\begin{pmatrix} P(S_t = 0) \\ P(S_t = 1) \end{pmatrix} = \begin{pmatrix} p_{00} & p_{10} \\ p_{01} & p_{11} \end{pmatrix} \begin{pmatrix} P(S_{t-1} = 0) \\ P(S_{t-1} = 1) \end{pmatrix}$$
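The matrix form can be read as a one-step update of the regime probabilities. A minimal sketch, assuming a two-state chain and purely illustrative stay probabilities (the function name `propagate` and the values 0.9 and 0.8 are ours, not estimates from the chapter):

```python
def propagate(prob_prev, p00, p11):
    """One step of the matrix update for a two-state Markov chain.

    prob_prev = [P(S_{t-1}=0), P(S_{t-1}=1)]; p00 and p11 are the
    probabilities of staying in regime 0 and in regime 1.
    """
    p01 = 1.0 - p00  # probability of leaving regime 0 for regime 1
    p10 = 1.0 - p11  # probability of leaving regime 1 for regime 0
    prob0 = p00 * prob_prev[0] + p10 * prob_prev[1]
    prob1 = p01 * prob_prev[0] + p11 * prob_prev[1]
    return [prob0, prob1]


# Illustrative values only: start in regime 0 with certainty.
state = [1.0, 0.0]
for _ in range(3):
    state = propagate(state, p00=0.9, p11=0.8)
print(state)
```

Because each row of transition probabilities sums to one, the updated probabilities always sum to one as well.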

Estimation of the transition probabilities $p_{ij}$ is usually done (numerically) by maximum likelihood as follows. The conditional probability density function of the observation $y_t$, given the state variables $S_t$, $S_{t-1}$ and the previous observations $F_{t-1} = \{y_{t-1}, y_{t-2}, \ldots\}$, is

$$f(y_t \mid S_t, S_{t-1}, F_{t-1}) = \frac{1}{\sqrt{2\pi\sigma^2_{S_t}}} \exp\left[-\frac{\left[y_t - \mu_{S_t} - \phi\,(y_{t-1} - \mu_{S_{t-1}})\right]^2}{2\sigma^2_{S_t}}\right] \tag{33.3}$$

since $\varepsilon_t = y_t - \mu_{S_t} - \phi\,(y_{t-1} - \mu_{S_{t-1}}) \sim \mathrm{NID}(0, \sigma^2_{S_t})$. The chain rule for conditional probabilities then yields the joint probability density function of $y_t$, $S_t$, $S_{t-1}$, given past information $F_{t-1}$: $f(y_t, S_t, S_{t-1} \mid F_{t-1}) = f(y_t \mid S_t, S_{t-1}, F_{t-1})\, P(S_t, S_{t-1} \mid F_{t-1})$, so that the log-likelihood function to be maximized with respect to the unknown parameters becomes

$$\ell_t(\theta) = \log\left[\sum_{S_t=0}^{1} \sum_{S_{t-1}=0}^{1} f(y_t \mid S_t, S_{t-1}, F_{t-1})\, P(S_t, S_{t-1} \mid F_{t-1})\right] \tag{33.4}$$

where $\theta = (p, q, \phi, \mu_0, \mu_1, \sigma_0^2, \sigma_1^2)$ and the transition probabilities are $p = P(S_t = 0 \mid S_{t-1} = 0)$ and $q = P(S_t = 1 \mid S_{t-1} = 1)$. The probabilities $P(S_0 = 1 \mid F_0)$ and $P(S_0 = 0 \mid F_0)$ are called the steady state probabilities and, given the transition probabilities $p$ and $q$, are obtained as:

$$P(S_0 = 1 \mid F_0) = \frac{1 - p}{2 - q - p}, \qquad P(S_0 = 0 \mid F_0) = \frac{1 - q}{2 - q - p}.$$
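The likelihood recursion can be sketched in a few lines. This is our own minimal illustration of the two-regime model with one autoregressive lag, not the code used for the results in this chapter; the function names and the sample data are hypothetical. The filter starts from the steady state probabilities, forms the joint probability of $(S_{t-1}, S_t)$, accumulates the log of the mixture density, and then updates the regime probabilities.

```python
import math


def steady_state(p, q):
    """Steady state probabilities [P(S=0), P(S=1)] for p = P(0|0), q = P(1|1)."""
    return [(1 - q) / (2 - p - q), (1 - p) / (2 - p - q)]


def normal_pdf(x, mean, var):
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


def log_likelihood(y, p, q, phi, mu, var):
    """Sum over t of the per-period log-likelihood contributions.

    mu = [mu_0, mu_1], var = [sigma_0^2, sigma_1^2].
    The first observation is used only as the lagged value.
    """
    trans = [[p, 1 - p], [1 - q, q]]  # trans[i][j] = P(S_t=j | S_{t-1}=i)
    filt = steady_state(p, q)         # P(S_{t-1}=i | F_{t-1}) at the start
    total = 0.0
    for t in range(1, len(y)):
        joint = [[0.0, 0.0], [0.0, 0.0]]  # joint[i][j]: S_{t-1}=i, S_t=j
        for i in range(2):
            for j in range(2):
                cond_mean = mu[j] + phi * (y[t - 1] - mu[i])
                density = normal_pdf(y[t], cond_mean, var[j])
                joint[i][j] = density * trans[i][j] * filt[i]
        lik_t = sum(joint[i][j] for i in range(2) for j in range(2))
        total += math.log(lik_t)
        # Update to P(S_t=j | F_t) by summing out S_{t-1}.
        filt = [(joint[0][j] + joint[1][j]) / lik_t for j in range(2)]
    return total


# Hypothetical data and parameter values, for illustration only.
sample = [0.10, 0.30, -0.20, 0.40, 0.05]
print(log_likelihood(sample, p=0.98, q=0.96, phi=0.3,
                     mu=[-0.02, 0.08], var=[0.01, 0.02]))
```

In practice this function would be handed to a numerical optimiser to maximise over the parameter vector; that step is omitted here.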

33.3.1.2 Stock Composite Index Moving Trends Estimation

In our case, we have three explanatory variables $X_{1t}, X_{2t}, X_{3t}$ in a Gaussian framework (Normal distribution) and the input argument $S$, which is equal to $S = [1\ 1\ 1\ 1]$. The model for the mean equation is then:

$$y_t = X_{1t}\beta_{1,S_t} + X_{2t}\beta_{2,S_t} + X_{3t}\beta_{3,S_t} + \varepsilon_t, \qquad \varepsilon_t \sim \mathrm{NID}(0, \sigma^2_{S_t}) \tag{33.5}$$

where $S_t$ represents the state at time $t$, that is, $S_t = 1, \ldots, K$ ($K$ is the number of states); $\sigma^2_{S_t}$ is the error variance at state $S_t$; $\beta_{i,S_t}$ is the beta coefficient for explanatory variable $i$ at state $S_t$, where $i$ goes from 1 to $n$; and $\varepsilon_t$ is the residual vector, which follows a particular distribution (in this case Normal). With this setting of the input argument $S$, the coefficients and the model’s variance switch according to the transition probabilities. The logic is as follows: the first elements of the input argument $S$ control the switching dynamic of the mean equation, while the last terms control the switching dynamic of the residual vector, including distribution parameters. Based on Gaussian maximum likelihood, the equations are represented as follows:

State 1 ($S_t = 1$): $y_t = X_{1t}\beta_{1,1} + X_{2t}\beta_{2,1} + X_{3t}\beta_{3,1} + \varepsilon_t$;

State 2 ($S_t = 2$): $y_t = X_{1t}\beta_{1,2} + X_{2t}\beta_{2,2} + X_{3t}\beta_{3,2} + \varepsilon_t$.

The transition matrix is

$$P = \begin{pmatrix} p_{11} & p_{21} \\ p_{12} & p_{22} \end{pmatrix}$$

which controls the probability of a regime switch from state j (column j) to state i (row i). The sum of each column in P is equal to one, since each column represents the full set of probabilities of the process for that state.

33.3.2 Radial Basis Function Neural Networks

The specific type of ANN employed in this study is the Radial Basis Function (RBF) network, the most widely used among the many types of neural networks. RBFs were first used to solve the interpolation problem of fitting a curve exactly through a set of points. Fausett defines radial basis functions as "activation functions with a local field of response at the output" [15]. The RBF neural networks are trained to generate both time series forecasts and certainty factors. The RBF neural network is composed of three layers of nodes. The first is the input layer, which feeds the input data to each of the nodes in the second, or hidden, layer. The second layer of nodes differs greatly from other neural networks in that each node represents a data cluster which is centered at a particular point and has a given radius. The third and final layer consists of only one node. It sums the outputs of the second layer of nodes to yield the decision value [16].

The input of the ith neuron of the hidden layer is

$$k_i^q = \sqrt{\sum_{j=1}^{n} \left(W1_{ji} - X_j^q\right)^2} \times b1_i$$

and its output is

$$r_i^q = \exp\left(-(k_i^q)^2\right) = \exp\left(-\left(\sqrt{\sum_{j=1}^{n} \left(W1_{ji} - X_j^q\right)^2} \times b1_i\right)^2\right) = \exp\left(-\left(\lVert W1_i - X^q \rVert \times b1_i\right)^2\right)$$

where $b1_i$ is the threshold value, $X^q$ is the input feature vector and the approximant output $r_i^q$ is differentiable with respect to the weights $W1_i$. When an input vector is fed into each node of the hidden layer simultaneously, each node calculates the distance from the input vector to its own center. That distance value is transformed via some function, and the result is output from the node. The value output from the hidden layer node is multiplied by a constant or weighting value. That product is fed into the third layer node, which sums all the products and any numeric constant inputs. Lastly, the third layer node outputs the decision value. A Gaussian basis function for the hidden units is given as $Z_j$ for $j = 1, \ldots, J$, where

$$Z_j = \exp\left(-\frac{\lVert X - \mu_j \rVert^2}{2\sigma_j^2}\right).$$

Here $\mu_j$ and $\sigma_j$ are the mean and the standard deviation, respectively, of the jth unit receptive field, and the norm is Euclidean.

In order to obtain the tendency of the A Share Composite Index, we examine the sample performance of quarterly return forecasts (40 quarters in total) for the Shanghai Stock Exchange Market from October 1999 to August 2009, using three exogenous macroeconomic variables, the CPI, the CGPI and the Money Supply (M1–M2, increase on an annual basis), as the inputs to the model. We use a Radial Basis Function network based on the learning algorithm presented above. The RBF network is created with the efficient design function (newrb) of the Matlab Neural Network Toolbox. According to Hagan et al. [17], a small spread constant results in a steep radial basis curve, while a large spread constant results in a smooth radial basis curve; it is therefore better to force a small number of neurons to respond to an input. Our interest is in obtaining a single consensus forecast output, namely the sign of the prediction only, which is compared with the real sign of the prediction variable. After several tests with different spread values, we found that spread = 4 was satisfactory for our test, which is in line with the suggestion that a good starting value for the spread constant lies between 2 and 8 [17]. We set the first nine columns of y0 as the test samples.
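The forward pass described above can be sketched as follows. This is a plain-Python illustration, not the toolbox implementation: the function names, centers and output weights are hypothetical, and tying the threshold value to the spread so that a hidden neuron responds with 0.5 at a distance equal to the spread is an assumption we make for the example.

```python
import math


def rbf_hidden_output(x, center, bias):
    """r = exp(-(||center - x|| * bias)^2) for one hidden neuron."""
    distance = math.sqrt(sum((w - xj) ** 2 for w, xj in zip(center, x)))
    return math.exp(-(distance * bias) ** 2)


def rbf_network(x, centers, biases, out_weights, out_bias):
    """Third layer: weighted sum of the hidden outputs plus a constant."""
    hidden = [rbf_hidden_output(x, c, b) for c, b in zip(centers, biases)]
    return sum(w * h for w, h in zip(out_weights, hidden)) + out_bias


# Assumption: the bias is set so that the response is 0.5 at distance `spread`.
spread = 4.0
bias = math.sqrt(math.log(2.0)) / spread

# Hypothetical two-neuron network with three inputs (e.g. CPI, CGPI, money supply).
centers = [[1.0, 2.0, 3.0], [4.0, 0.0, -1.0]]
biases = [bias, bias]
out_weights = [0.6, 0.4]
print(rbf_network([1.0, 2.0, 3.0], centers, biases, out_weights, out_bias=0.0))
```

A smaller spread makes the bias larger, so each neuron's response falls off more steeply around its center, which matches the behaviour attributed to the spread constant in the text.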

33.4 Empirical Results

33.4.1 Stock Composite Index Moving Trends Estimation by MRS

Table 33.3 shows the estimated coefficients of the proposed MRS model, along with the test statistics needed to evaluate the Stock Composite Index moving trends. The Likelihood Ratio test of the null hypothesis of linearity is statistically significant, which suggests that linearity is strongly rejected. The results in Table 33.3 highlight several further points. First, the value of the switching variable is 0.7506 at state 1 and -0.0161 at state 2, and the model’s standard deviation σ takes the values 0.0893 and 0.0688 for regime 1 and regime 2, respectively; these values help us to identify regime 1 as the upward regime and regime 2 as the downward regime. Second, the duration measure shows that the upward regime lasts approximately 57 months, whereas the downward regime lasts approximately 24 months.

Table 33.3 Stock index moving trends estimation by MRS

Parameters                 Estimate              Std err
$\mu_0$                    0.7506                0.0866
$\mu_2$                    -0.0161               0.0627
$\sigma_0^2$               0.0893                0.0078
$\sigma_2^2$               0.0688                0.0076
Expected duration          56.98 time periods    23.58 time periods
Transition probabilities
p (regime1)                0.98
q (regime0)                0.96
Final log likelihood       119.9846
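For a first-order Markov chain, the expected time spent in a regime before leaving it is 1/(1 - stay probability). The short sketch below (function names are ours) inverts this relation for the two expected durations reported in Table 33.3 and checks that the implied stay probabilities round to the tabulated 0.98 and 0.96.

```python
def expected_duration(stay_probability):
    """Expected number of periods spent in a regime before leaving it."""
    return 1.0 / (1.0 - stay_probability)


def implied_stay_probability(duration):
    """Invert the relation: the stay probability implied by a duration."""
    return 1.0 - 1.0 / duration


p_implied = implied_stay_probability(56.98)
q_implied = implied_stay_probability(23.58)
print(round(p_implied, 2), round(q_implied, 2))
```

Working directly from the rounded probabilities gives 50 and 25 periods, so the reported durations were presumably computed from unrounded estimates.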

Fig. 33.4 Smoothed states probabilities (moving trends)

Because we use quarterly data for estimating the moving trends, the smoothed and filtered state probability lines appear sparse. Figure 33.4 shows the resulting smoothed probabilities of being in the upward and downward moving trend regimes for the Shanghai Stock Exchange Market general price index. The filtered state probabilities are shown in Fig. 33.5. Several periods of the sample are characterized by downward movement associated with the presence of a rational bubble in the capital market of China from 1999 to 2009.

33.4.2 Radial Basis Function Neural Networks

Interestingly, the best results we obtained from RBF training are 100% correct approximations of the sign on the test set, and 90% of the series on the training set. On the one hand, this conclusion is consistent with the findings in "the Stock Market and the Business Cycle" by Hamilton and Lin [3]. Hamilton and Lin [3] argued that the analysis of macroeconomic fundamentals was certainly a satisfactory explanation for stock volatility. To the best of our knowledge, fluctuations in the level of macroeconomic variables such as CPI and CGPI and in other economic activity are a key determinant of the level of stock returns [18]. On the other hand, it has also been shown, in a related application, that RBFs have the "best" approximation property: there is always a choice for the parameters that is better than any other possible choice, a property that is not shared by MLPs.

Fig. 33.5 Filtered states probabilities (moving trends)

Table 33.4 RBF training output

x           y0  T     x           y0  T
0.80937     1   1     0.031984    0   0
0.30922     0   0     0.80774     1   0
0.96807     1   1     0.68064     1   0
1.0459      1   1     0.74969     1   0
-0.011928   0   0     0.54251     1   1
0.92        1   1     0.91874     1   1
0.81828     1   1     0.50662     1   1
0.054912    0   0     0.44189     0   1
0.34783     0   0     0.59748     1   1
0.80987     1   1     0.69514     1   1
1.1605      1   1     1.0795      1   1
0.66608     1   1     0.16416     0   0
0.22703     0   0     0.97289     1   1
0.45323     0   0     -0.1197     0   0
0.69459     1   1     0.028258    0   0
0.16862     0   0     0.087562    0   0
0.83891     1   1     -0.084324   0   0
0.61556     1   1     1.0243      1   1
1.0808      1   1     0.98467     1   1
-0.089779   0   0     0.0032105   0   0

Owing to the Normal distribution intervals, the outputs are mapped as $y_0 = F(X)$, with $F(X) = 1$ if $X \ge 0.5$ and $F(X) = 0$ if $X < 0.5$. Table 33.4 gives the resulting outputs. From the outputs we can see that the duration of regime 1 is 24 quarters and that of regime 0 is 16 quarters. The comparison of the MRS and RBF models can be seen in Table 33.5. It is clear that the RBF model outperforms the MRS model on the regime duration estimation.
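The thresholding rule and the duration counts can be reproduced directly from the 40 values in Table 33.4. In the sketch below the variable names are ours, and the order in which the two halves of the table are concatenated is an assumption that does not affect the counts.

```python
# Raw network outputs x from Table 33.4 (left block followed by right block).
x = [
    0.80937, 0.30922, 0.96807, 1.0459, -0.011928, 0.92, 0.81828, 0.054912,
    0.34783, 0.80987, 1.1605, 0.66608, 0.22703, 0.45323, 0.69459, 0.16862,
    0.83891, 0.61556, 1.0808, -0.089779,
    0.031984, 0.80774, 0.68064, 0.74969, 0.54251, 0.91874, 0.50662, 0.44189,
    0.59748, 0.69514, 1.0795, 0.16416, 0.97289, -0.1197, 0.028258, 0.087562,
    -0.084324, 1.0243, 0.98467, 0.0032105,
]
# Targets T from Table 33.4, in the same order.
target = [
    1, 0, 1, 1, 0, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0,
    0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 1, 0, 0, 0, 0, 1, 1, 0,
]


def threshold(value):
    """F(X) = 1 if X >= 0.5, otherwise 0."""
    return 1 if value >= 0.5 else 0


y0 = [threshold(value) for value in x]
accuracy = sum(1 for a, b in zip(y0, target) if a == b) / len(target)

quarters_regime1 = sum(y0)
quarters_regime0 = len(y0) - quarters_regime1
observed_regime1 = sum(target)
observed_regime0 = len(target) - observed_regime1

print(accuracy, quarters_regime1, quarters_regime0)
print(3 * quarters_regime1, 3 * quarters_regime0)  # durations in months
```

Multiplying the quarter counts by three gives the durations in months that appear in the regime comparison table.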

Table 33.5 Regime comparison of stock index moving trends

Model                   Regime 1 (months)   Regime 0 (months)
Observed durations      66                  54
Markov-switching        57                  24
Radial basis function   72                  48
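One simple way to read Table 33.5 is to compute each model's absolute error against the observed durations. The sketch below does only that; the dictionary layout and names are ours.

```python
observed = {"regime1": 66, "regime0": 54}
models = {
    "Markov-switching": {"regime1": 57, "regime0": 24},
    "Radial basis function": {"regime1": 72, "regime0": 48},
}

# Absolute error in months for each model and regime.
abs_errors = {
    name: {regime: abs(estimate[regime] - observed[regime]) for regime in observed}
    for name, estimate in models.items()
}

for name, errors in abs_errors.items():
    print(name, errors, "total:", sum(errors.values()))
```

On this measure the RBF durations are closer to the observed ones in both regimes.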

33.5 Conclusion and Future Work

In this chapter, we compared the forecasting performance of two nonlinear models in order to address issues concerning the behavior of aggregate stock returns in the Chinese stock market. Rigorous comparisons between the two nonlinear estimation methods were made. From the Markov Regime Switching model, it can be concluded that real output growth is subject to abrupt changes in the mean associated with the states of the economy. On the other hand, the ANN method, developed with the prediction algorithm to obtain abnormal stock returns, indicates that stock returns should take into account the level of influence generated by macroeconomic variables. Further study will concentrate on the prediction of market volatility using this research framework.

Acknowledgments This work was supported by the Pilot Funds (2009) from Suzhou Municipal Government (Singapore Industrial Park and Higher Educational Town, SIPEDI) for XJTLU Lab for Research in Financial Mathematics and Computing.

References

1. Henderson JM, Quandt RE (1958) Micro-economic theory: a mathematical approach. McGraw-Hill, New York
2. Hamilton JD (1989) A new approach to the economic analysis of nonstationary time series and the business cycle. Econometrica 57(2):357–384
3. Hamilton JD, Lin G (1996) Stock market volatility and the business cycle. J Appl Econom 11(5):573–593
4. Hamilton JD (1996) Specification tests in Markov-switching time-series models. J Econom 70(1):127–157
5. Hamilton JD, Susmel R (1994) Autoregressive conditional heteroskedasticity and changes in regime. J Econom 64(1–2):307–333
6. Edwards S, Susmel R (2001) Volatility dependence and contagion in emerging equities markets. J Dev Econ 66(2):505–532
7. Coe PJ (2002) Financial crisis and the great depression: a regime switching approach. J Money Credit Bank 34(1):76–93
8. Hamilton JD (1994) Time series analysis. Princeton University Press, Princeton
9. Chen S, Cowan CFN, Grant PM (1991) Orthogonal least squares learning algorithm for radial basis function network. IEEE Trans Neural Netw 2(2):302–309
10. Swanson N, White H (1995) A model selection approach to assessing the information in the term structure using linear models and artificial neural networks. J Bus Econ Stat 13(8):265–275
11. Zhang G, Patuwo BE, Hu MY (1998) Forecasting with artificial neural networks: the state of the art. Int J Forecast 14(1):35–62
12. Qi M, Maddala GS (1999) Economic factors and the stock market: a new perspective. J Forecast 18(3):151–166
13. Desai VS, Bharati R (1998) The efficiency of neural networks in predicting returns on stock and bond indices. Decis Sci 29(2):405–425
14. Motiwalla L, Wahab M (2000) Predictable variation and profitable trading of US equities: a trading simulation using neural networks. Comput Oper Res 27(11–12):1111–1129
15. Fausett L (1994) Fundamentals of neural networks: architectures, algorithms and applications. Prentice-Hall, Upper Saddle River
16. Moody J, Darken C (1989) Fast learning in networks of locally tuned processing units. Neural Comput 1(2):281–294
17. Hagan MT, Demuth HB, Beale MH (1996) Neural network design. PWS Publishing, Boston
18. Liu D, Zhang L (2010) China stock market regimes prediction with artificial neural network and markov regime switching. In: Lecture notes in engineering and computer science: proceeding of the world congress on engineering 2010, WCE 2010, 30 June–2 July, 2010 London, UK, pp 378–383

Chapter 34

Fund of Hedge Funds Portfolio Optimisation Using a Global Optimisation Algorithm

Bernard Minsky, M. Obradovic, Q. Tang and Rishi Thapar

Abstract Portfolio optimisation for a Fund of Hedge Funds ("FoHF") has to address the asymmetric, non-Gaussian nature of the underlying returns distributions. Furthermore, the objective functions and constraints are not necessarily convex or even smooth. Traditional portfolio optimisation methods such as mean–variance optimisation are therefore not appropriate for such problems, and global search optimisation algorithms could serve better to address them. In implementing such an approach, the goal is also to incorporate information on expected future outcomes to determine the optimised portfolio, rather than to optimise a portfolio on historic performance. In this paper, we consider the suitability of global search optimisation algorithms applied to FoHF portfolios, and use one of these algorithms to construct an optimal portfolio of investable hedge fund indices given forecast views of the future and our confidence in such views.

B. Minsky (✉) · R. Thapar
International Asset Management Ltd., 7 Clifford Street, London, W1S 2FT, UK

M. Obradovic · Q. Tang
School of Mathematical and Physical Sciences, Sussex University, Brighton, BN1 9RF, UK

S. I. Ao and L. Gelman (eds.), Electrical Engineering and Applied Computing, Lecture Notes in Electrical Engineering, 90, DOI: 10.1007/978-94-007-1192-1_34, Springer Science+Business Media B.V. 2011

34.1 Introduction

The motivation for this paper was to develop a more robust approach to constructing portfolios of hedge fund investments that takes account of the issues that confront portfolio managers:

1. The non-Gaussian, asymmetric nature of hedge fund returns;
2. The tendency of optimisation algorithms to find corner solutions;
3. The speed in computation and efficiency in finding the solution; and
4. The desire to incorporate forecast views into the problem specification.

We describe here how each of these issues was addressed and illustrate with reference to the optimising of a portfolio of investable hedge fund indices. This paper synthesises a review of the applicability of global search optimisation algorithms for financial portfolio optimisation with the development of a Monte Carlo simulation approach to forecasting hedge fund returns and implementing the methodology into an integrated forecasting and optimisation application. In Sect. 34.2, we summarise the review of global search optimisation algorithms and their applicability to the FoHF portfolio optimisation problem. In Sect. 34.3, we describe the Monte Carlo simulation technique adopted using resampled historical returns data of hedge fund managers and also how we incorporated forecast views and confidence levels, expressed as probability outcomes, into our returns distribution data. In Sect. 34.4, we report the results of applying the methodology to a FoHF portfolio optimisation problem and in Sect. 34.5, we draw our conclusions from the study.

34.2 Review of Global Search Optimisation Algorithms

The FoHF portfolio optimisation problem is an example of the typical minimisation problem in finance:

$$f(x) \to \min, \qquad g(x) \mathrel{(<)=} g_0, \quad \ldots, \quad h(x) \mathrel{(<)=} h_0$$

where f, called the objective function, is non-convex and may be non-smooth. The g, …, h are constraint functions, with g0, …, h0 as minimum thresholds. The variable x usually denotes the weights assigned to each asset, and the constraints will usually include the buying and shorting limits on each asset. It is well known that many of the objective functions and constraints specified in financial minimisation problems are not differentiable. Traditional asset management has relied on the Markowitz specification as a mean–variance optimisation problem, which is soluble by classical optimisation methods. However, in FoHF portfolio optimisation the distribution of hedge fund returns is non-Gaussian, and the typical objective functions and constraints are not limited to the simple mean, variance and higher order moments of the distribution. We have previously [1] discussed the use of performance and risk statistics such as maximum drawdown, downside deviation, co-drawdown and omega as potential objective functions and constraint functions, which are not obviously differentiable. With the ready availability of powerful computing and less demand on smoothness, it is possible to look for global optimisation algorithms that do not require regularity of the objective (constraint) functions to solve the financial minimisation problem.

In our review of the literature [1], we found that there are three main ideas in global optimisation: Direct, Genetic Algorithm, and Simulated Annealing. In addition, there are a number of other methods which are derived from one or more of the ideas listed above. A key characteristic of fund of funds portfolio optimisation, in common with other portfolio optimisation problems, is that the dimensionality of the problem space is large. Typically, a portfolio of hedge funds will have between 20 and 40 assets, with some commingled funds having significantly more assets. This means that the search algorithm cannot conduct an exhaustive test of the whole space efficiently. For example, if we have a portfolio of 40 assets we have a 40-dimensional space, and an initial grid of 100 points on each axis produces $100^{40} = 10^{80}$ initial test points to evaluate the region where the global minimum might be found. This would require considerable computing power and would not be readily feasible.

Each of the methods we considered in our review requires an initial search set. The choice of the initial search set is important, as the quality of the set affects the workload required to find the global minimum. The actual approach to moving from the initial set to finding better and better solutions differs across the methods, and our search also revealed some approaches that combine the methods to produce a hybrid algorithm. In the paper [1] we evaluated seven algorithms across the methods to identify which method and specific algorithm was best suited to our FoHF portfolio optimisation problem. The algorithms considered are described here.
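To make concrete why such statistics are awkward for gradient-based methods, the sketch below gives simple sample versions of two of them, maximum drawdown and the omega ratio, applied to a weighted portfolio. These are common textbook definitions written by us for illustration, with hypothetical return data; the exact definitions used in [1] may differ.

```python
def portfolio_returns(weights, asset_returns):
    """Per-period portfolio return for fixed weights.

    asset_returns is a list of periods, each a list of asset returns.
    """
    return [sum(w * r for w, r in zip(weights, period)) for period in asset_returns]


def max_drawdown(returns):
    """Largest peak-to-trough loss of cumulative wealth, as a fraction of the peak."""
    wealth, peak, worst = 1.0, 1.0, 0.0
    for r in returns:
        wealth *= 1.0 + r
        peak = max(peak, wealth)
        worst = max(worst, (peak - wealth) / peak)
    return worst


def omega_ratio(returns, threshold=0.0):
    """Sum of gains above the threshold divided by sum of shortfalls below it."""
    gains = sum(max(r - threshold, 0.0) for r in returns)
    losses = sum(max(threshold - r, 0.0) for r in returns)
    return float("inf") if losses == 0.0 else gains / losses


# Hypothetical monthly returns for two assets over four periods.
history = [[0.02, -0.01], [-0.03, 0.01], [0.01, 0.02], [-0.01, -0.02]]
blended = portfolio_returns([0.5, 0.5], history)
print(max_drawdown(blended), omega_ratio(blended))
```

Both functions involve `max` operations over the return path, so they have kinks as the weights change, which is one reason derivative-free global search is attractive here.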

34.2.1 PGSL: Probabilistic Global Search Lausanne

PGSL is a hybrid algorithm, proposed by Raphael [2], that draws on the Simulated Annealing method. It adapts its search grid to concentrate on regions of the search space that are favourable and intensifies the density of sampling in these attractive regions. The search space is sampled using a probability distribution function for each axis of the multi-dimensional search space. At the outset of the search process, the probability distribution function is a uniform distribution with intervals of constant width. During the process, the probability distribution function is updated by increasing the probability and decreasing the width of intervals in the regions with good functional values. A focusing algorithm is used to progressively narrow the search space by changing the minimum and maximum of each dimension of the search space.

34.2.2 MCS: Multi-Level Co-ordinate Search

MCS belongs to the family of branch and bound methods. It seeks to solve bound constrained optimisation problems by combining global search (partitioning the search space into smaller boxes) and local search (partitioning sub-boxes based on desired functional values). In this way, the search is focused in favour of sub-boxes where low functional values are expected. The balance between the global and local parts of the search is obtained using a multi-level approach. The sub-boxes are assigned a level, which is a measure of how many times a sub-box has been processed. The global search part of the optimisation process starts with the sub-boxes that have low level values. At each level, the box with the lowest functional value determines the local search process. The optimisation method is described in the paper by Huyer and Neumaier [3]. Some of the finance papers that have examined MCS include Aggregating Risk Capital [4] and Optimising Omega [5]. In Aggregating Risk Capital [4], the Value-at-Risk of a portfolio is calculated using marginal distributions of the risk factors, and MCS is employed to search for the best-possible lower bound on the joint distribution of marginal distributions of the risk factors. Optimising Omega [5] uses MCS to optimise the Omega ratio, a non-smooth performance measure, of a portfolio.

34.2.3 MATLAB Direct

The Direct Search algorithm, available in MATLAB’s Genetic Algorithm and Direct Search Toolbox, uses a pattern search methodology for solving bound linear or non-linear optimisation problems [6]. The algorithms used are the Generalised Pattern Search (GPS) algorithm and the Mesh Adaptive Direct Search (MADS) algorithm. The pattern search algorithm generates a set of search directions or search points to approach an optimal point. Around each search point, an area, called a mesh, is formed by adding the current point to a scalar multiple of a set of vectors called a pattern. If a point in the mesh is found that improves the objective function at the current point, the new point becomes the current point for the next step, and so on. The GPS method uses fixed direction vectors and MADS uses random vectors to define a mesh.
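The polling idea can be illustrated with a basic coordinate pattern search. This is a deliberately simple sketch written by us, with fixed coordinate directions and a mesh that is halved after an unsuccessful poll; it is not the toolbox's GPS or MADS implementation, and the non-smooth test objective is hypothetical.

```python
def pattern_search(objective, start, lower, upper,
                   step=0.25, tol=1e-6, max_iter=10000):
    """Minimise `objective` inside a box by polling +/- step along each axis."""
    point = list(start)
    best = objective(point)
    for _ in range(max_iter):
        if step < tol:
            break
        improved = False
        for i in range(len(point)):
            for direction in (1.0, -1.0):
                trial = list(point)
                # Keep the mesh point inside the bounds.
                trial[i] = min(upper[i], max(lower[i], point[i] + direction * step))
                value = objective(trial)
                if value < best:
                    point, best, improved = trial, value, True
                    break
            if improved:
                break
        if not improved:
            step *= 0.5  # unsuccessful poll: refine the mesh
    return point, best


def kinked(w):
    """Hypothetical non-smooth objective with its minimum at (0.3, 0.7)."""
    return abs(w[0] - 0.3) + abs(w[1] - 0.7)


solution, value = pattern_search(kinked, [0.0, 0.0], [0.0, 0.0], [1.0, 1.0])
print(solution, value)
```

No derivatives are evaluated at any point, which is why this family of methods tolerates objectives with kinks.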

34.2.4 MATLAB Simulated Annealing

The Simulated Annealing method uses a probabilistic search algorithm that models the physical process of heating a material and then slowly lowering the temperature to decrease defects, thus minimising the system energy [6]. By analogy with this physical process, each step in the Simulated Annealing algorithm replaces the current point by another point that is chosen depending on the difference between the functional values at the two points and on the temperature variable, which is systematically decreased during the process.
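A minimal annealing loop along these lines is sketched below. It is our own generic illustration, not the MATLAB routine: the Gaussian proposal, geometric cooling schedule, parameter defaults and test objective are all assumptions chosen for the example.

```python
import math
import random


def simulated_annealing(objective, start, lower, upper, rng,
                        steps=5000, start_temp=1.0, cooling=0.999, move_size=0.1):
    """Minimise `objective` in a box; returns the best point seen and its value."""
    current = list(start)
    current_value = objective(current)
    best, best_value = list(current), current_value
    temperature = start_temp
    for _ in range(steps):
        # Propose a nearby point, clipped to the bounds.
        candidate = [min(hi, max(lo, c + rng.gauss(0.0, move_size)))
                     for c, lo, hi in zip(current, lower, upper)]
        candidate_value = objective(candidate)
        delta = candidate_value - current_value
        # Always accept improvements; accept worse points with a
        # probability that shrinks as the temperature falls.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_value = candidate, candidate_value
            if current_value < best_value:
                best, best_value = list(current), current_value
        temperature *= cooling
    return best, best_value


def kinked(w):
    """Hypothetical non-smooth objective with its minimum at (0.3, 0.7)."""
    return abs(w[0] - 0.3) + abs(w[1] - 0.7)


best_point, best_val = simulated_annealing(
    kinked, [0.0, 0.0], [0.0, 0.0], [1.0, 1.0], random.Random(0))
print(best_point, best_val)
```

Because acceptance is random, repeated runs with different seeds can return different points, which is consistent with the run-to-run variation that stochastic methods can show.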

34.2.5 MATLAB Genetic Algorithm

MATLAB’s Genetic Algorithm is based on the principles of natural selection and uses the idea of mutation to produce new points in the search for an optimised solution [6]. At each step, the Genetic Algorithm selects individuals at random from the current population to be parents and uses them to produce the children for the next generation. In this way, the population evolves toward an optimal solution.
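The generational loop can be sketched as follows. This is a generic real-coded genetic algorithm written by us for illustration and is not the MATLAB implementation: drawing parents at random from the fitter half of the population, uniform crossover, Gaussian mutation and keeping the single best individual are all assumptions.

```python
import random


def genetic_algorithm(objective, lower, upper, rng,
                      pop_size=30, generations=60,
                      mutation_rate=0.2, mutation_scale=0.1):
    """Minimise `objective` in a box; returns (best_point, best_value, history)."""
    def clip(individual):
        return [min(hi, max(lo, g)) for g, lo, hi in zip(individual, lower, upper)]

    population = [[rng.uniform(lo, hi) for lo, hi in zip(lower, upper)]
                  for _ in range(pop_size)]
    history = []  # best objective value at the start of each generation
    for _ in range(generations):
        population.sort(key=objective)
        history.append(objective(population[0]))
        parents = population[: pop_size // 2]  # fitter half (assumed selection rule)
        children = [population[0]]             # elitism: keep the current best
        while len(children) < pop_size:
            mother, father = rng.sample(parents, 2)
            # Uniform crossover: each gene comes from one parent at random.
            child = [m if rng.random() < 0.5 else f for m, f in zip(mother, father)]
            # Mutation: occasionally perturb a gene.
            child = [g + rng.gauss(0.0, mutation_scale) if rng.random() < mutation_rate else g
                     for g in child]
            children.append(clip(child))
        population = children
    population.sort(key=objective)
    return population[0], objective(population[0]), history


def kinked(w):
    """Hypothetical non-smooth objective with its minimum at (0.3, 0.7)."""
    return abs(w[0] - 0.3) + abs(w[1] - 0.7)


ga_point, ga_value, ga_history = genetic_algorithm(
    kinked, [0.0, 0.0], [1.0, 1.0], random.Random(3))
print(ga_point, ga_value)
```

Keeping the best individual in each generation guarantees that the best value never gets worse from one generation to the next, although the final answer still depends on the random seed.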

34.2.6 TOMLAB LGO

Tomlab’s Global Optimiser, TOMLAB/LGO, combines global and local search methodologies [7]. The global search is implemented using the branch and bound method and adaptive random search. The local search is implemented using a generalised reduced gradient algorithm.

34.2.7 NAG Global Optimiser

NAG’s Global Optimiser, E05JBF, is based on MCS, as described above. E05JBF is described in NAG’s Library Routine Document [8] and Optimising Omega [5].

The above algorithms were evaluated on three constrained optimisation problems. The constraints consisted of both linear constraints on the allocation weights to the assets and constraints on the level of functions that characterise the portfolio’s performance or risk. The algorithms were measured on time to run, percentage of corners in the optimal solution, and deviation from the average optimal solution. A simple scoring rule combining these three factors as a weighted sum was constructed. There was considerable variation in relative performance between the algorithms across the different tests. Two algorithms, MATLAB Annealing and MATLAB Genetics, were found to be unstable, giving different results when the same problem was run repeatedly in the same environment. They also produced widely different results, from very good to very bad, across the tests and were readily rejected from consideration. The other five algorithms all produced acceptable results, with MATLAB Direct scoring best across the constrained optimisation examples. PGSL, the adaptive Simulated Annealing algorithm, performs reasonably in most tests and has been used by IAM for the past 4 years. Therefore, we chose to compare MATLAB Direct with PGSL in our portfolio optimisation implementation.

34.3 Implementing the Global Search Optimisation Algorithm

Traditional optimisation of portfolios has focused on determining the optimal portfolio given the history of asset returns and assuming that the distribution of returns is Gaussian and stationary over time. Our experience is that these assumptions do not hold and that any optimisation should use the best forecast we can make for the horizon over which the portfolio is being optimised. When investing in hedge funds, liquidity terms are quite onerous, with lock-ups, redemption terms ranging from monthly to annual frequencies, and notice periods ranging from a few days to 6 months. This means that the investment horizon tends to be 6–12 months ahead, to reflect the minimum time any investment will be in a portfolio. The forecast performance of the assets within the portfolio is produced using Monte Carlo simulation and re-sampling. The objective is to produce a random sample of likely outcomes, period by period, for the forecast horizon, based on the empirical distributions observed for the assets and modified by our views as to the likely performance of the individual assets. This is clearly a non-trivial exercise, further complicated by our wish to maintain the relationship between the asset distributions and any embedded serial correlation within the individual asset distributions.

The approach implemented has three components:

1. Constructing a joint distribution of the asset returns from which to sample;
2. Simulating the returns of the assets over the forecast horizon; and
3. Calculating the relevant objective function and constraints for the optimisation.

34.3.1 Constructing the Joint Distribution of Asset Returns We used bootstrapping in a Monte-Carlo simulation framework to produce the distribution of future portfolio returns. Bootstrapping is a means of using the available data by resampling with replacement. This generates a richer sample than would otherwise be available. To preserve the relationship between the assets we

34

Fund of Hedge Funds Portfolio Optimisation

425

treat the set of returns for the assets in a time period as an observation of the joint distribution of the asset returns. An enhancement to this sampling scheme, to capture any serial correlation, is to block sample a group of contiguous periods, say three periods together. Block sampling of three periods at a time offers around 10 million distinct samples of blocks of three time periods.

As we used bootstrapping to sample from the distribution and we wished to preserve the characteristics of the joint distribution, we needed to define a time range over which we have returns for all of the assets in the portfolio. Hedge funds generally report returns on a monthly basis, which means that we needed to go back a reasonable period of time to obtain a sufficiently large number of observations for the bootstrap sampling to be effective. For hedge funds this is complicated because many of the funds have not been in existence for very long, with the median life of a hedge fund being approximately 3 years. Although a longer range for the joint distribution gives more points for sampling, the lack of stationarity within the distribution leads us to select a compromise period, typically 5 years, as the desired range.

Where a hedge fund does not have a complete 5 year history, we employed a backfill methodology to provide the missing data. There are a number of approaches to backfilling asset return time series, such as: selecting a proxy asset to fill the series; using a strategy index with a random noise component; constructing a factor model of the asset returns from the available history and using the factor return history and model to backfill; or randomly selecting an asset from a set of candidate assets that could have been chosen for the portfolio for the periods in which the actual asset did not exist. We adopted this last method, selecting an asset from a set of available candidates within a peer group for the missing asset. Where the range for which returns are missing was long, we repeated the exercise of selecting an asset at random from the available candidates within the peer group, say, every six periods. Our reasoning for applying this approach is that we assume that, as portfolio managers, given the strategy allocation of the portfolio, we would have chosen an asset from the candidate peer group available at that time to complete the portfolio. Using this process we constructed a complete set of returns for each of the assets going back, say, 5 years.

The quality of the backfill depends on how narrowly the candidate peer group is defined. At International Asset Management Limited (IAM), we have defined our own internal set of strategy peer groups that best reflect our interpretation of the strategies in which we invest. This is because the hedge fund classifications adopted by most of the index providers tend to be broad, and can include funds that would not feature in IAM's classifications.
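The block sampling idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' MATLAB implementation: the function name `block_bootstrap`, the use of NumPy, and the choice to draw block start points uniformly are all hypothetical. Whole rows are sampled so that each draw remains an observation of the joint distribution.

```python
import numpy as np

def block_bootstrap(returns, horizon, block_len=3, rng=None):
    """Draw one simulated path of joint returns by block bootstrap.

    returns: (T, N) array, one row per period and one column per asset.
    Rows are kept intact (joint distribution) and are drawn in blocks of
    block_len contiguous periods (short-range serial correlation).
    """
    rng = np.random.default_rng() if rng is None else rng
    returns = np.asarray(returns, dtype=float)
    n_periods = returns.shape[0]
    n_blocks = -(-horizon // block_len)  # ceiling division
    # Assumption: each block start is uniform over the starts that leave
    # room for block_len contiguous rows.
    starts = rng.integers(0, n_periods - block_len + 1, size=n_blocks)
    rows = np.concatenate([np.arange(s, s + block_len) for s in starts])
    return returns[rows[:horizon]]
```

For example, with 60 monthly observations, `block_bootstrap(history, horizon=12)` returns a 12-period path built from four blocks of three consecutive months.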

34.3.2 Simulating Returns Over the Forecast Horizon

We simulated the returns of the assets using a block bootstrap of the empirical joint distributions, which are modified by probabilistically shifting the expected return of the sample according to our assessment of the likely return outcomes for the assets. First we describe the process of incorporating forecast views by expectation shifting and then we describe the block bootstrapping method.

The desire to include forecast views, expressed as expected annual returns, and confidence, expressed as probabilities, within a portfolio optimisation problem has been addressed in a number of ways. Black and Litterman developed an approach where the modeller expresses a view as to the expected mean of a returns series and attaches a confidence to each view. This approach is Bayesian and allows the traditional Mean–Variance approach to be adapted to produce more stable and intuitive allocations which do not favour corner solutions. However, we have chosen an empirical approach to including views: we mix probabilistically mean-shifted versions of the empirical distribution, which allows a range of outcomes to be specified with a confidence associated with each view. Figure 34.1 shows how applying a probabilistic shift to the mean of a distribution not only repositions the distribution but also changes the higher order moments, as the spread, skew and kurtosis all change.

Fig. 34.1 Probabilistic shifting of expected mean (plot titled "Impact of Mixing Views"; series: Pessimistic, Most Likely, Optimistic, Mixed)

In Table 34.1 the forecast views for a number of strategies are set out with their associated confidence. The optimistic, pessimistic and most likely views are the best assessment of the potential expected return of the mean fund within the strategy. The confidence level represents the likelihood of that view prevailing. We note that the three confidence levels sum to one. We use these likelihoods to determine for each asset, according to its strategy, which shift should be applied to the distribution for that simulation. This is implemented by simply sampling from the uniform distribution and dividing the unit interval into three segments according to the confidence levels associated with the three views.
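The view-selection step (one uniform draw, with the unit interval cut into segments by the confidence levels) might look as follows in Python. The function name `sample_view` and the ordering of the views are assumptions made for illustration; the source does not give an implementation.

```python
import numpy as np

def sample_view(confidences, rng):
    """Return the index of the view drawn for one simulation trial.

    confidences: probabilities attached to the views, in a fixed order
    (for example optimistic, pessimistic, most likely); must sum to one.
    """
    edges = np.cumsum(confidences)  # upper boundaries of the segments
    if not np.isclose(edges[-1], 1.0):
        raise ValueError("confidence levels must sum to one")
    u = rng.uniform()  # u lies in [0, 1)
    # Index of the first segment whose upper boundary exceeds u; min()
    # guards against round-off in the last boundary.
    return min(int(np.searchsorted(edges, u, side="right")), len(edges) - 1)
```

With confidences of 25%, 25% and 50%, roughly a quarter of the trials select each of the first two views and half select the third.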
Recognising that each asset does not track its strategy with certainty, we calculate the beta for the asset with respect to the strategy and adjust the return by the randomly chosen shift ("k") multiplied by the asset beta. So the return in any period ("t") for an asset ("a") which follows strategy ("s") for simulation trial ("n") is:

$$r^{m}_{a,t,n} = r^{\mathrm{raw}}_{a,t,n} + \beta \times \mathrm{shift}_{s,k}$$

where β is the beta of asset a with respect to strategy s.

Table 34.1 Forecast views and confidence by hedge fund strategy

Strategy                     Opt. view (%)   Conf./prob. (%)   Pess. view (%)   Conf./prob. (%)   Most likely view (%)   Conf./prob. (%)
Convertible bond arbitrage   17.5            25                7.5              25                12.5                   50
Credit                       17.5            25                7.5              25                12.5                   50
Event driven                 10.0            25                0.0              25                5.0                    50
Fixed income rel val         15.0            25                10.0             25                12.5                   50
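As a rough illustration of the beta-scaled shift, the sketch below estimates a beta and applies it to one trial's returns. Two things are assumptions rather than statements from the source: the beta is taken to be the usual covariance-over-variance estimate, and the shift is assumed to be supplied already expressed per period. The helper names are hypothetical.

```python
import numpy as np

def estimate_beta(asset_returns, strategy_returns):
    """Covariance of asset and strategy over variance of the strategy."""
    cov = np.cov(asset_returns, strategy_returns)  # 2x2 sample covariance
    return cov[0, 1] / cov[1, 1]

def shifted_returns(raw_returns, beta, shift):
    """Apply r_m = r_raw + beta * shift to every period of one trial."""
    return np.asarray(raw_returns, dtype=float) + beta * shift
```

An asset with a beta of 0.5 to its strategy therefore receives only half of the shift drawn for that strategy in a given trial.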

34.3.3 Calculating the Objective Function and Constraint Functions

In implementing the bootstrapped Monte Carlo simulation we simulate 500 trials or scenarios for the assets in the portfolio. This produces a distribution of returns for each asset and the distributions of any statistics we may wish to compute. Our objective and constraint functions are statistics based on the distribution of portfolio returns. With a set of asset allocation weights, the distribution of portfolio returns and the distributions of the statistics may be calculated.

It is worth discussing how we use this information within the optimisation algorithm. To do this we shall use as an example maximising expected return subject to a limit on maximum drawdown. As we have chosen to optimise expected return, our objective function is simply the median of the distribution of portfolio returns. If we set our objective to ensure performance is at an acceptable level in most circumstances, we might choose the bottom five percentile of return as the objective function so as to maximise the least likely (defined as fifth percentile) return. This reflects the flexibility we have when using a simulated distribution as the data input into the optimisation process.

In PGSL, as with almost all of the global search optimisation algorithms, both the linear and non-linear constraints are defined as penalty functions added to the objective function and hence are soft constraints rather than hard constraints that must be satisfied. The weight attached to each penalty function determines how acceptable a constraint violation is. In our example, we define the penalty function as the average of the maximum drawdown for the lowest five percentile of the maximum drawdown distribution, less the constraint boundary (assuming the conditional average exceeds the constraint level), multiplied by an importance factor:

$$\text{Max dd penalty} = \frac{\max\bigl(\text{Constraint dd} - \operatorname{average}(\text{Max dd}_n \mid \text{Lower 5\%ile}),\ 0\bigr)}{\text{No. of Trials}} \times \text{Importance}$$

This measure is analogous to an expected tail loss or Conditional VaR (CVaR) in that it is an estimate of the conditional expectation of the maximum drawdown for the lower tail of the distribution of drawdowns.

Table 34.2 FoHF portfolio optimisation problem

Objective                                       Maximise median portfolio return
Subject to:
Maximum drawdown over forecast period           Less than 5%
Total allocations for full investment           100%
Cash                                            10%
Within the following constraints:
RBC Hedge 250 Equity Market Neutral             Between 10 and 16%
RBC Hedge 250 Equity Long/Short Directional     Between 14 and 20%
All Long/Short Equity                           Between 24 and 36%
RBC Hedge 250 Fixed Income Arbitrage            Between 7 and 13%
RBC Hedge 250 Macro                             Between 10 and 20%
RBC Hedge 250 Managed Futures                   Between 10 and 20%
RBC Hedge 250 Credit                            Between 5 and 15%
RBC Hedge 250 Mergers and Special Situations    Between 0 and 10%
RBC Hedge 250 Multi-Strategy                    Between 0 and 10%
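A penalised objective of the kind described in Sect. 34.3.3 can be sketched as below. This is one possible reading, not the authors' code. It assumes that drawdowns are held as positive magnitudes (so the breach is the tail average minus the limit), that the "lower 5%ile" means the worst 5% of trials, and it leaves out any normalisation by the number of trials. The function names and the default importance factor are hypothetical.

```python
import numpy as np

def max_drawdown(path_returns):
    """Largest peak-to-trough fall of cumulative value, as a positive fraction."""
    wealth = np.cumprod(1.0 + np.asarray(path_returns, dtype=float))
    # Running peak, including the starting value of 1.0.
    peaks = np.maximum.accumulate(np.concatenate(([1.0], wealth)))[1:]
    return float(np.max(1.0 - wealth / peaks))

def penalised_objective(trial_returns, weights, dd_limit=0.05,
                        importance=10.0, tail=0.05):
    """Median horizon return minus a soft max-drawdown penalty.

    trial_returns: (n_trials, n_periods, n_assets) simulated asset returns.
    Returns (objective value, average drawdown in the worst tail).
    """
    port = np.asarray(trial_returns, dtype=float) @ np.asarray(weights, dtype=float)
    total = np.prod(1.0 + port, axis=1) - 1.0  # return over the horizon
    drawdowns = np.array([max_drawdown(p) for p in port])
    k = max(1, int(np.ceil(tail * len(drawdowns))))  # size of the worst tail
    tail_avg = float(np.sort(drawdowns)[-k:].mean())
    penalty = importance * max(tail_avg - dd_limit, 0.0)
    return float(np.median(total)) - penalty, tail_avg
```

Because the penalty is added to the objective rather than enforced, a global search routine can still visit allocations that breach the drawdown limit; the importance factor controls how strongly such visits are discouraged.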

34.4 Results of Optimising a FoHF Portfolio

The approach to optimising a FoHF portfolio has been implemented in MATLAB and applied to a portfolio of eight RBC Hedge 250 hedge fund strategy indices. The monthly returns for the indices from July 2005 are available from the RBC website. As the simulation requires 5 years of monthly returns, the series were backfilled from IAM's pre-determined group of candidate assets within the relevant investment strategy peer group, using random selection as previously described. The portfolio was optimised with an objective function to maximise median returns subject to constraints on the maximum and minimum allocations to each asset, a constraint on the maximum and minimum allocation to Long/Short Equity strategies and a maximum allowable maximum drawdown of 5% over the forecast horizon. Thus the optimisation problem is as set out in Table 34.2.

First, we noted that the total allocations satisfy the equality constraint that all capital is deployed with both PGSL and MATLAB Direct, and that all the asset allocation constraints, including the constraint on all Long/Short Equity strategies, are satisfied by MATLAB Direct but not by PGSL. Secondly, we noted that with PGSL only one other allocation is near its lower or upper bound, whereas with MATLAB Direct five allocations are at or near either the lower or upper bound. Thirdly, we compared the results to a portfolio where the allocation of capital to each asset was chosen to be the midpoint between the lower and upper bounds placed on that asset (the naïve allocation). We noted that both optimisers improved median returns (7.8 and 8.0% vs. 7.40%) and that MATLAB Direct reduced the breach of the maximum drawdown constraint (1.71% vs. 2.70%), whereas the PGSL optimisation failed to improve on this condition (3.22% vs. 2.70%). The MATLAB Direct portfolio had the better maximum drawdown distribution both in terms of worst case and general performance. Also, the MATLAB Direct optimised portfolio performs the best of the three portfolios in terms of cumulative returns. Finally, we noted that the PGSL optimisation terminated on maximum iterations and this might explain why it failed to meet all the allocation criteria (Table 34.3).

Table 34.3 Optimal allocations and results

Asset                                       Lower bound (%)   Upper bound (%)   Naïve (%)   PGSL (%)   Direct (%)
Cash                                        10                10                10.0        10.0       10.0
RBC Hedge 250 Equity Market Neutral         10                16                13.0        15.0       16.0
RBC Hedge 250 Equity Long/Short             14                20                17.0        16.1       20.0
RBC Hedge 250 Fixed Income Arbitrage        7                 13                10.0        12.5       13.0
RBC Hedge 250 Macro                         10                20                15.0        12.7       13.9
RBC Hedge 250 Managed Futures               10                20                15.0        12.7       15.4
RBC Hedge 250 Credit                        5                 15                10.0        16.7 (a)   11.5
RBC Hedge 250 Mergers and Sp. Situations    0                 10                5.0         0.6        0.0
RBC Hedge 250 Multi-Strategy                0                 10                5.0         5.8        0.1
Median return                               –                 –                 7.40        7.80       7.96
Excess Tail Maximum Drawdown                –                 –                 2.70        3.22       1.71

(a) In breach of upper allocation constraint
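The naïve allocation and the bound checks discussed above can be reproduced directly from the figures in Table 34.3. The short Python sketch below does only that; the helper names are hypothetical, and the group constraint on all Long/Short Equity strategies is deliberately left out because the source does not list which assets belong to that group.

```python
# (lower %, upper %) per asset, in the order listed in Table 34.3
BOUNDS = {
    "Cash": (10, 10),
    "Equity Market Neutral": (10, 16),
    "Equity Long/Short": (14, 20),
    "Fixed Income Arbitrage": (7, 13),
    "Macro": (10, 20),
    "Managed Futures": (10, 20),
    "Credit": (5, 15),
    "Mergers and Sp. Situations": (0, 10),
    "Multi-Strategy": (0, 10),
}
PGSL = [10.0, 15.0, 16.1, 12.5, 12.7, 12.7, 16.7, 0.6, 5.8]
DIRECT = [10.0, 16.0, 20.0, 13.0, 13.9, 15.4, 11.5, 0.0, 0.1]

def naive_allocation(bounds):
    """Midpoint of the lower and upper bound for every asset."""
    return [(lo + hi) / 2 for lo, hi in bounds.values()]

def breaches(weights, bounds, tol=1e-9):
    """Names of the assets whose weight falls outside its bounds."""
    return [name for w, (name, (lo, hi)) in zip(weights, bounds.items())
            if w < lo - tol or w > hi + tol]
```

Run on the tabulated figures, the check flags only the PGSL Credit allocation, which matches the footnote to Table 34.3.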

34.5 Conclusion

The review of Global Search Optimisation algorithms showed that there is a range of methods available, but their relative performance is variable. The specifics of the problem and the initial conditions can impact the results significantly. In applying MATLAB Direct and PGSL to the FoHF portfolio optimisation problem, we observed that we improved on the naïve solution in both cases, but each method presented solution characteristics that might be less desirable. PGSL was unable to find a solution that met its threshold stopping criterion, whilst MATLAB Direct found a solution with many corner points. Further research is required to evaluate the stability of the optimiser outputs and their sensitivity to the salient optimisation parameters.

References

1. Minsky B, Obradovic M, Tang Q, Thapar R (2008) Global optimisation algorithms for financial portfolio optimisation. Working paper, University of Sussex
2. Raphael B, Smith IFC (2003) A direct stochastic algorithm for global search. Appl Math Comput 146:729–758
3. Huyer W, Neumaier A (1999) Global optimisation by multilevel coordinate search. J Global Optim 14:331–355
4. Embrechts P, Puccetti G (2006) Aggregating risk capital, with an application to operational risk. Geneva Risk Insur 31(2):71–90
5. Kane SJ, Bartholomew-Biggs MC (2009) Optimising omega. J Global Optim 45(1)
6. Genetic Algorithm and Direct Search Toolbox™ 2 user's guide, Mathworks. http://www.mathworks.com/access/helpdesk/help/pdf_doc/gads/gads_tb.pdf
7. User's guide for TOMLAB/LGO, TOMLAB. http://tomopt.com/docs/TOMLAB_LGO.pdf
8. NAG Library Routine Document E05JBF, NAG. http://www.nag.co.uk/numeric/FL/nagdoc_fl22/pdf/E05/e05jbf.pdf

Chapter 35
Increasing the Sensitivity of Variability EWMA Control Charts

Saddam Akber Abbasi and Arden Miller

Abstract The control chart is the most important statistical process control (SPC) tool used to monitor the reliability and performance of manufacturing processes. Variability EWMA charts are widely used for the detection of small shifts in process dispersion. For ease of computation, all the variability EWMA charts proposed so far are based on asymptotic control limits. This study shows that quicker detection of initial out-of-control conditions can be achieved by using exact, or time varying, control limits. Moreover, the effect of the fast initial response (FIR) feature, to further increase the sensitivity of variability EWMA charts for detecting process shifts, has not been studied so far in the SPC literature. It has been observed that the FIR based variability EWMA chart is more sensitive in detecting process shifts than the variability charts based on time varying or asymptotic control limits.

S. A. Abbasi (✉) · A. Miller
Department of Statistics, The University of Auckland, Private Bag 92019, Auckland 1142, New Zealand
e-mail: [email protected]
A. Miller
e-mail: [email protected]

S. I. Ao and L. Gelman (eds.), Electrical Engineering and Applied Computing, Lecture Notes in Electrical Engineering, 90, DOI: 10.1007/978-94-007-1192-1_35, © Springer Science+Business Media B.V. 2011

35.1 Introduction

Control charts, introduced by Walter A. Shewhart in the 1920s, are the most important statistical process control (SPC) tool used to monitor the reliability and performance of manufacturing processes. The basic purpose of implementing control chart procedures is to detect abnormal variations in the process parameters (location and scale). Although first proposed for the manufacturing industry, control charts have now been applied in a wide variety of disciplines, such as nuclear engineering [1], health care [2], education [3] and analytical laboratories [4, 5]. Shewhart-type control charts are the most widely used: process location is usually monitored by an X̄ chart and process dispersion by an R or S chart. Research has shown that, due to the memoryless nature of Shewhart control charts, they do not perform well for the detection of small and moderate shifts in the process parameters. When quick detection of small shifts is desirable, cumulative sum (CUSUM) and exponentially weighted moving average (EWMA) charts are superior alternatives to Shewhart charts (for details see [6, 7]).

Since the introduction of the EWMA chart by [8], many researchers have examined these charts from different perspectives; see, for example, [5, 9–16] and the references therein. In contrast to Shewhart-type charts, which are based only on information from the current observations, EWMA charts also make use of information from historical observations by adopting a varying weight scheme: the highest weight is assigned to the most recent observations and the weights decrease exponentially for less recent observations. This helps in the earlier detection of small shifts in the process (location and scale) parameters (see [6]).

Monitoring process variability using EWMA charts has also attracted the attention of many researchers. Some important contributions are [17–22]. Recently [15] proposed a new EWMA chart for monitoring process dispersion, the NEWMA chart, and showed that the NEWMA chart outperformed the variability EWMA chart proposed by [19] in terms of average run length. All the variability EWMA schemes proposed so far are based on asymptotic control limits.
Ease of computation has been reported as the main reason for using asymptotic limits, but this makes the EWMA chart insensitive to start-up quality problems. It should be noted that the exact control limits of EWMA charts vary with time and approach the asymptotic limits as time increases (see [6]). When the process is initially out-of-control, it is extremely important to detect the sources of these out-of-control conditions as early as possible so that corrective actions can be taken at an early stage. This can be achieved by using the exact limits instead of the asymptotic control limits. The sensitivity of the time varying EWMA chart can be increased further by narrowing the time varying limits at process start-up or adding a head start feature. In the SPC framework this feature is well known as fast initial response (FIR) (for details see [6]). The effect of the FIR feature on the sensitivity of variability EWMA charts has not been investigated so far in the SPC literature.

This study investigates the performance of variability EWMA charts that use asymptotic, time varying and FIR based control limits. The comparison is made on the basis of run length characteristics such as the average run length (ARL), the median run length (MDRL) and the standard deviation of the run length distribution (SDRL). To investigate the effect of time varying control limits and of FIR on variability EWMA chart performance, we use the NEWMA chart, which was recently proposed by [15] in the Journal of Quality Technology. Time varying and FIR based control limits are constructed for the NEWMA chart and their performance is compared to that of asymptotic control limits.

The rest of the study is organized as follows: Sect. 35.2 briefly introduces the structure of the NEWMA chart and presents the design of the NEWMA chart using time varying control limits (TNEWMA chart). The next section compares the run length characteristics of the NEWMA and TNEWMA charts. The effect of the FIR feature is then investigated and compared to the asymptotic and time varying EWMA schemes. To give better insight into the run length distribution of these charts, run length curves are also presented. The chapter ends with concluding remarks.
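The exponentially decreasing weighting scheme mentioned in the introduction can be made concrete with a few lines of Python. This is a generic illustration of EWMA weights, with the smoothing constant written `lam` and a hypothetical function name; it is not specific to any one chart in this chapter.

```python
def ewma_weights(lam, t):
    """Weights an EWMA at time t places on observations t, t-1, ..., 1.

    Also returns the weight left on the starting value; all the weights
    together sum to one.
    """
    obs = [lam * (1 - lam) ** i for i in range(t)]  # most recent first
    start = (1 - lam) ** t
    return obs, start
```

With `lam = 0.2` the most recent observation carries weight 0.2, the one before it 0.16, then 0.128, and so on, which is why a small smoothing constant lets many past observations contribute to the detection of a small shift.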

35.2 TNEWMA Chart

In this section we briefly describe the structure of the NEWMA chart as proposed by [15] and construct time varying control limits for this chart. Assume the quality variable of interest X follows a normal distribution with mean $\mu_t$ and variance $\sigma_t^2$ (i.e. $X \sim N(\mu_t, \sigma_t^2)$). Let $S_t^2$ represent the sample variance (computed from a sample of size n) and let $\delta_t$ represent the ratio of the process standard deviation $\sigma_t$ to its true value $\sigma_0$ at time period t (i.e. $\delta_t = \sigma_t/\sigma_0$). Suppose $Y_t = \ln(S_t^2/\sigma_0^2)$; for an in-control process, i.e. $\sigma_t = \sigma_0$, $Y_t$ is approximately normally distributed with mean $\mu_Y$ and variance $\sigma_Y^2$, where

$$\mu_Y = \ln(\delta_t^2) - \frac{1}{n-1} - \frac{1}{3(n-1)^2} + \frac{2}{15(n-1)^4} \tag{35.1}$$

and

$$\sigma_Y^2 = \frac{2}{n-1} + \frac{2}{(n-1)^2} + \frac{4}{3(n-1)^3} - \frac{16}{15(n-1)^5}. \tag{35.2}$$

Note that when the process is in control the statistic

$$Z_t = \frac{Y_t - \mu_Y\big|_{\sigma_t = \sigma_0}}{\sigma_Y} \tag{35.3}$$

is, under this approximation, a standard normal variate. When the process is out of control, $Z_t \sim N(\gamma_t, 1)$, where $\gamma_t = \ln(\sigma_t^2/\sigma_0^2)/\sigma_Y$ [15]. The EWMA statistic for monitoring process variability used by [15] is based on resetting $Z_t$ to zero whenever its value becomes negative, i.e. $Z_t^+ = \max(0, Z_t)$. The NEWMA chart is based on plotting the EWMA statistic

$$W_t = \lambda\left(Z_t^+ - \frac{1}{\sqrt{2\pi}}\right) + (1-\lambda)W_{t-1}, \tag{35.4}$$

where the smoothing constant $\lambda$ is the weight assigned to the most recent sample observation ($0 < \lambda \le 1$). Small values of $\lambda$ are effective for quick detection of small process shifts. As the value of $\lambda$ increases, the NEWMA chart performs better for the detection of large process shifts. An out of control signal occurs whenever $W_t > UCL_a$, where

$$UCL_a = L_a \sqrt{\frac{\lambda}{2-\lambda}}\; \sigma_{Z_t^+}. \tag{35.5}$$

Ref. [23] showed that

$$\sigma_{Z_t^+}^2 = \frac{1}{2} - \frac{1}{2\pi}. \tag{35.6}$$
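To make the construction of the NEWMA statistic concrete, the following Python sketch evaluates Eqs. 35.1 to 35.6 for a sequence of sample variances. It is an illustration only: the function names are hypothetical, the in-control case is assumed when computing the mean of $Y_t$ (so $\ln \delta_t^2 = 0$), and the starting value $W_0 = 0$ is an assumption not stated in the text.

```python
import math

def log_variance_moments(n):
    """Approximate in-control mean and variance of Y = ln(S^2 / sigma0^2)."""
    m = n - 1
    mu = -1 / m - 1 / (3 * m**2) + 2 / (15 * m**4)              # Eq. 35.1
    var = 2 / m + 2 / m**2 + 4 / (3 * m**3) - 16 / (15 * m**5)  # Eq. 35.2
    return mu, var

SIGMA_ZPLUS = math.sqrt(0.5 - 1 / (2 * math.pi))                # Eq. 35.6

def newma_path(sample_variances, n, sigma0_sq, lam, w0=0.0):
    """NEWMA statistic W_t for each sample variance (w0 = 0 is assumed)."""
    mu_y, var_y = log_variance_moments(n)
    w, path = w0, []
    for s2 in sample_variances:
        z = (math.log(s2 / sigma0_sq) - mu_y) / math.sqrt(var_y)      # Eq. 35.3
        z_plus = max(0.0, z)                                          # reset negatives
        w = lam * (z_plus - 1 / math.sqrt(2 * math.pi)) + (1 - lam) * w  # Eq. 35.4
        path.append(w)
    return path

def ucl_asymptotic(lam, l_a):
    """Asymptotic upper control limit of Eq. 35.5."""
    return l_a * math.sqrt(lam / (2 - lam)) * SIGMA_ZPLUS
```

A chart is then obtained by comparing each value returned by `newma_path` with `ucl_asymptotic(lam, l_a)` and signalling at the first exceedance.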

We will see that the exact variance of $W_t$ is time varying, and hence the exact control limit should depend on time, approaching $UCL_a$ as $t \to \infty$. By defining $Z_t' = Z_t^+ - \frac{1}{\sqrt{2\pi}}$, we can write $W_t$ as

$$W_t = \lambda Z_t' + (1-\lambda)W_{t-1}. \tag{35.7}$$

By continuous substitution of $W_{t-i}$, $i = 1, 2, \ldots, t$, the EWMA statistic $W_t$ can be written as (see [6, 8]):

$$W_t = \lambda \sum_{i=0}^{t-1} (1-\lambda)^i Z_{t-i}' + (1-\lambda)^t W_0. \tag{35.8}$$

Taking the variance of both sides, we obtain

$$\mathrm{Var}(W_t) = \lambda^2 \sum_{i=0}^{t-1} (1-\lambda)^{2i}\, \mathrm{Var}(Z_{t-i}') + (1-\lambda)^{2t}\, \mathrm{Var}(W_0). \tag{35.9}$$

For independent random observations $Z_t'$, $\mathrm{Var}(Z_t') = \mathrm{Var}(Z_{t-i}') = \sigma_{Z_t^+}^2$. Treating the starting value $W_0$ as a constant, so that $\mathrm{Var}(W_0) = 0$, and summing the geometric series, we have

$$\mathrm{Var}(W_t) = \sigma_{Z_t^+}^2 \left( \lambda^2 \left[ \frac{1 - (1-\lambda)^{2t}}{1 - (1-\lambda)^2} \right] \right). \tag{35.10}$$

Since $1 - (1-\lambda)^2 = \lambda(2-\lambda)$, this further simplifies to

$$\mathrm{Var}(W_t) = \sigma_{Z_t^+}^2\, \frac{\lambda}{2-\lambda} \left[ 1 - (1-\lambda)^{2t} \right]. \tag{35.11}$$

For the rest of the study we will refer to the variability EWMA chart based on the exact variance of $W_t$ given in Eq. 35.11 as the TNEWMA chart. The TNEWMA chart gives an out of control signal whenever $W_t > UCL_t$, where

$$UCL_t = L_t \sqrt{\frac{\lambda \left[ 1 - (1-\lambda)^{2t} \right]}{2-\lambda}}\; \sigma_{Z_t^+}. \tag{35.12}$$

$UCL_t$ converges to $UCL_a$ as $t \to \infty$, where the rate of convergence is slower for smaller values of $\lambda$.
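The behaviour of the time varying limit can be explored numerically. The Python sketch below evaluates Eq. 35.12 and counts how many periods are needed before the time varying limit is within a chosen relative tolerance of its asymptote. The 1% tolerance and the function names are illustrative choices, not part of the source.

```python
import math

SIGMA_ZPLUS = math.sqrt(0.5 - 1 / (2 * math.pi))  # Eq. 35.6

def ucl_time_varying(t, lam, l_t):
    """Time varying upper control limit of Eq. 35.12."""
    return l_t * math.sqrt(lam * (1 - (1 - lam) ** (2 * t)) / (2 - lam)) * SIGMA_ZPLUS

def periods_to_converge(lam, rel_tol=0.01):
    """Smallest t at which UCL_t is within rel_tol of its asymptotic value."""
    t = 1
    # UCL_t / UCL_asymptotic = sqrt(1 - (1 - lam)^(2t)) for a common multiple L.
    while math.sqrt(1 - (1 - lam) ** (2 * t)) < 1 - rel_tol:
        t += 1
    return t
```

Comparing `periods_to_converge(0.05)` with `periods_to_converge(0.5)` shows the slower convergence for small smoothing constants, which is where the exact limits differ most from the asymptotic ones.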

35.3 Comparison of Run Length Characteristics of NEWMA and TNEWMA Charts

To evaluate the performance of control charts, the average run length (ARL) is the most important and widely used measure. The ARL is the mean number of observations until an out of control signal is given by a control chart. In this study, a Monte Carlo simulation with 10,000 iterations is used to approximate the run length distributions of the NEWMA and TNEWMA charts, following the methods of [9, 24, 25, 26]. Note that [27, 28] indicate that even 5,000 replications are enough for finding ARLs within an acceptable error rate in many control chart settings. To give better insight into the performance of the proposed charts, the median and the standard deviation of the run length distribution are also provided.

The run length characteristics of the NEWMA and TNEWMA charts are summarised in Tables 35.1 and 35.2 for different values of the smoothing parameter λ. In these tables ARL denotes the average run length, SDRL denotes the standard deviation of the run length distribution and MDRL denotes the median of the run length distribution. In each table, the smoothing constant λ increases as we move across columns from left to right, whereas the shift δ increases as we move across rows from top to bottom. The rows corresponding to δ = 1 provide the run length characteristics of both charts when the process is assumed to be in statistical control. The process is said to be out-of-control for δ > 1.0. The control chart multiples L_a and L_t are chosen to give the same in-control average run length of 200 (i.e. ARL0 = 200) for both charts.

The results in Tables 35.1 and 35.2 indicate that for smaller values of λ (the most popular choice for EWMA charts), the out-of-control ARL (ARL1) of the TNEWMA chart is significantly lower than the ARL1 of the NEWMA chart. For example, ARL1 = 9.93 for the TNEWMA chart using λ = 0.05 and δ = 1.2, while for the NEWMA chart ARL1 = 14.52 for the same values of λ and δ. This indicates that the TNEWMA chart requires on average nearly five fewer observations than the NEWMA chart to detect a shift of 1.2σ in process variability when λ = 0.05. The MDRL of the TNEWMA chart is also lower than the MDRL of the NEWMA chart, while there is a slight increase in the SDRL of the TNEWMA chart as compared to the NEWMA chart for lower values of λ and δ.

Figure 35.1 presents an ARL comparison of the NEWMA and TNEWMA charts for some choices of λ. In each plot, the size of the multiplicative shift in process variability δ is plotted on the horizontal axis while the ARL is plotted on the vertical axis on a logarithmic scale for better visual comparison. The effect of using time varying control limits can be clearly seen from Fig. 35.1, particularly for smaller values of λ. As expected, the ARL of the TNEWMA chart starts to converge towards the ARL of the NEWMA chart as λ increases. At λ = 1, UCL_t = UCL_a, as the factor (1 − (1 − λ)^{2t}) reduces to 1, and hence the ARL performance of both charts is similar. Moreover, Fig. 35.2 shows the percentage decrease in ARL1 of the TNEWMA chart as compared to the NEWMA chart for certain choices of λ and δ. We can see that the difference in ARL1 between the two charts is bigger for smaller values of λ and higher values of δ. The difference tends to reduce as λ increases and δ decreases. Hence the use of exact control limits also improves variability EWMA chart performance for detecting shifts of higher magnitude.

Table 35.1 Run length characteristics of NEWMA chart when ARL0 = 200
Columns correspond to the smoothing constant λ, with the control limit multiple L_a beneath it; rows are grouped by the shift δ.

λ            0.05    0.10    0.15    0.20    0.25    0.30    0.50    0.70    0.90    1.00
L_a         1.569   1.943   2.148   2.271   2.362   2.432   2.584   2.650   2.684   2.693
1.0  ARL   199.69  200.74  200.24  199.80  199.23  200.39  199.88  199.82  199.11  199.52
1.0  MDRL  136.00  140.50  139.00  137.00  138.00  142.00  139.00  139.00  141.00  137.00
1.0  SDRL  197.62  198.83  200.98  197.14  197.55  203.09  197.02  196.87  195.46  202.39
1.1  ARL    31.68   35.33   37.69   40.11   41.38   43.26   49.42   54.11   61.16   65.19
1.1  MDRL   24.00   26.00   28.00   29.00   29.00   31.00   35.00   38.00   43.00   46.00
1.1  SDRL   26.26   30.77   34.30   36.79   38.99   41.50   48.52   55.14   59.77   64.62
1.2  ARL    14.52   14.81   15.48   15.97   16.48   17.30   19.66   22.22   25.87   28.44
1.2  MDRL   12.00   12.00   12.00   12.00   12.00   13.00   14.00   16.00   18.00   20.00
1.2  SDRL   10.05   11.06   12.12   13.13   13.93   15.41   18.54   21.10   25.47   27.88
1.3  ARL     9.21    9.06    9.07    9.25    9.34    9.57   10.34   11.73   13.89   14.97
1.3  MDRL    8.00    8.00    7.00    7.00    7.00    7.00    8.00    8.00   10.00   10.00
1.3  SDRL    5.49    5.87    6.43    6.83    7.35    7.55    8.89   11.04   13.51   14.43
1.4  ARL     6.72    6.53    6.45    6.42    6.46    6.47    6.76    7.31    8.46    9.23
1.4  MDRL    6.00    6.00    5.00    5.00    5.00    5.00    5.00    5.00    6.00    7.00
1.4  SDRL    3.67    3.92    4.09    4.26    4.54    4.71    5.63    6.43    7.78    8.85
1.5  ARL     5.30    5.18    4.98    4.87    4.81    4.78    4.86    5.15    5.86    6.38
1.5  MDRL    5.00    5.00    4.00    4.00    4.00    4.00    4.00    4.00    4.00    5.00
1.5  SDRL    2.69    2.83    2.89    2.93    3.12    3.26    3.74    4.32    5.22    5.93
1.6  ARL     4.52    4.27    4.10    3.99    3.93    3.87    3.83    3.95    4.41    4.61
1.6  MDRL    4.00    4.00    4.00    3.00    3.00    3.00    3.00    3.00    3.00    3.00
1.6  SDRL    2.15    2.16    2.25    2.30    2.39    2.42    2.72    3.10    3.79    4.03
1.7  ARL     3.91    3.67    3.50    3.41    3.31    3.27    3.15    3.19    3.43    3.69
1.7  MDRL    4.00    3.00    3.00    3.00    3.00    3.00    3.00    2.00    3.00    3.00
1.7  SDRL    1.74    1.74    1.81    1.83    1.89    1.95    2.15    2.40    2.78    3.16
1.8  ARL     3.47    3.26    3.11    3.01    2.91    2.85    2.68    2.70    2.86    3.01
1.8  MDRL    3.00    3.00    3.00    3.00    3.00    2.00    2.00    2.00    2.00    2.00
1.8  SDRL    1.46    1.52    1.52    1.54    1.60    1.63    1.78    1.90    2.26    2.48
1.9  ARL     3.17    2.96    2.83    2.71    2.62    2.51    2.37    2.37    2.46    2.58
1.9  MDRL    3.00    3.00    3.00    2.00    2.00    2.00    2.00    2.00    2.00    2.00
1.9  SDRL    1.30    1.30    1.34    1.39    1.40    1.38    1.47    1.62    1.84    2.00
2.0  ARL     2.92    2.73    2.59    2.43    2.37    2.29    2.13    2.12    2.16    2.24
2.0  MDRL    3.00    2.00    2.00    2.00    2.00    2.00    2.00    2.00    2.00    2.00
2.0  SDRL    1.13    1.17    1.18    1.18    1.19    1.21    1.29    1.41    1.54    1.69
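A Monte Carlo estimate of the run length characteristics can be sketched as follows. This is a simplified illustration rather than the simulation used for the tables: the sample size `n = 5` is an assumption (the excerpt does not state it), the starting value is taken as zero, and far fewer replications are used by default, so the output is not expected to reproduce the tabulated values exactly.

```python
import math
import numpy as np

def run_length(delta, lam, limit_mult, n=5, time_varying=False,
               rng=None, max_len=100_000):
    """Number of samples until W_t first exceeds the upper control limit."""
    rng = np.random.default_rng() if rng is None else rng
    m = n - 1
    mu_y = -1 / m - 1 / (3 * m**2) + 2 / (15 * m**4)
    sd_y = math.sqrt(2 / m + 2 / m**2 + 4 / (3 * m**3) - 16 / (15 * m**5))
    sd_zplus = math.sqrt(0.5 - 1 / (2 * math.pi))
    w = 0.0  # assumed starting value
    for t in range(1, max_len + 1):
        # Sample of size n with sigma_t = delta * sigma_0 and sigma_0 = 1.
        s2 = np.var(rng.normal(0.0, delta, size=n), ddof=1)
        z_plus = max(0.0, (math.log(s2) - mu_y) / sd_y)
        w = lam * (z_plus - 1 / math.sqrt(2 * math.pi)) + (1 - lam) * w
        scale = 1 - (1 - lam) ** (2 * t) if time_varying else 1.0
        if w > limit_mult * math.sqrt(lam * scale / (2 - lam)) * sd_zplus:
            return t
    return max_len  # run truncated without a signal

def run_length_summary(delta, lam, limit_mult, reps=1000, seed=0, **kwargs):
    """ARL, MDRL and SDRL estimated from reps simulated run lengths."""
    rng = np.random.default_rng(seed)
    rl = np.array([run_length(delta, lam, limit_mult, rng=rng, **kwargs)
                   for _ in range(reps)])
    return {"ARL": float(rl.mean()), "MDRL": float(np.median(rl)),
            "SDRL": float(rl.std(ddof=1))}
```

Calling `run_length_summary(1.2, 0.05, 1.649, time_varying=True)` gives a rough counterpart of the TNEWMA entries; in practice the limit multiple would be tuned until the in-control ARL equals the target of 200.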

Table 35.2 Run length characteristics of TNEWMA chart when ARL0 = 200
Columns correspond to the smoothing constant λ, with the control limit multiple L_t beneath it; rows are grouped by the shift δ.

λ            0.05    0.10    0.15    0.20    0.25    0.30    0.50    0.70    0.90    1.00
L_t         1.649   1.975   2.164   2.279   2.379   2.440   2.588   2.652   2.685   2.693
1.0  ARL   199.85  200.82  200.30  200.19  200.29  200.37  200.13  200.63  199.76  199.73
1.0  MDRL  124.00  134.00  136.50  134.00  138.00  137.00  139.00  141.00  140.00  140.00
1.0  SDRL  209.87  206.10  212.73  199.28  208.94  202.42  200.30  199.74  200.94  194.83
1.1  ARL    25.80   31.45   34.71   37.99   40.25   42.42   49.21   54.01   61.15   65.18
1.1  MDRL   16.00   22.00   25.00   26.00   28.00   30.00   34.00   38.00   43.00   45.00
1.1  SDRL   28.83   32.55   34.31   37.60   40.52   41.58   49.17   53.70   60.63   63.66
1.2  ARL     9.93   11.91   13.18   14.08   15.52   16.65   19.29   21.96   25.67   28.43
1.2  MDRL    6.00    8.00   10.00   10.00   11.00   12.00   14.00   15.00   18.00   20.00
1.2  SDRL   10.36   11.59   12.53   13.07   14.56   15.26   18.48   21.30   25.29   28.12
1.3  ARL     5.61    6.64    7.37    7.81    8.33    8.66    9.98   11.39   13.66   14.94
1.3  MDRL    4.00    5.00    6.00    6.00    6.00    6.00    7.00    8.00   10.00   11.00
1.3  SDRL    5.64    6.09    6.54    6.88    7.30    7.73    9.41   10.90   13.12   14.40
1.4  ARL     3.78    4.56    4.92    5.19    5.46    5.67    6.48    7.09    8.31    9.23
1.4  MDRL    3.00    3.00    4.00    4.00    4.00    4.00    5.00    5.00    6.00    7.00
1.4  SDRL    3.49    3.97    4.21    4.33    4.60    4.69    5.67    6.26    8.05    8.67
1.5  ARL     2.84    3.33    3.68    3.86    4.03    4.14    4.61    5.01    5.78    6.38
1.5  MDRL    2.00    2.00    3.00    3.00    3.00    3.00    4.00    4.00    4.00    5.00
1.5  SDRL    2.49    2.78    3.01    3.10    3.26    3.28    3.80    4.40    5.24    5.82
1.6  ARL     2.32    2.70    2.94    3.04    3.17    3.27    3.60    3.82    4.30    4.58
1.6  MDRL    2.00    2.00    2.00    2.00    3.00    3.00    3.00    3.00    3.00    3.00
1.6  SDRL    1.90    2.16    2.25    2.30    2.37    2.48    2.82    3.16    3.69    4.17
1.7  ARL     2.00    2.29    2.43    2.55    2.66    2.68    2.92    3.07    3.38    3.65
1.7  MDRL    1.00    2.00    2.00    2.00    2.00    2.00    2.00    2.00    3.00    3.00
1.7  SDRL    1.50    1.70    1.81    1.85    1.95    1.95    2.16    2.41    2.77    3.08
1.8  ARL     1.77    2.00    2.14    2.23    2.28    2.26    2.47    2.59    2.81    3.01
1.8  MDRL    1.00    1.00    2.00    2.00    2.00    2.00    2.00    2.00    2.00    2.00
1.8  SDRL    1.23    1.42    1.53    1.58    1.60    1.64    1.78    1.92    2.24    2.47
1.9  ARL     1.60    1.78    1.91    1.97    2.02    2.03    2.18    2.27    2.39    2.56
1.9  MDRL    1.00    1.00    1.00    2.00    2.00    2.00    2.00    2.00    2.00    2.00
1.9  SDRL    1.04    1.17    1.28    1.30    1.34    1.35    1.47    1.60    1.90    2.06
2.0  ARL     1.47    1.64    1.72    1.76    1.82    1.84    1.95    2.02    2.09    2.24
2.0  MDRL    1.00    1.00    1.00    1.00    1.00    1.00    2.00    2.00    2.00    2.00
2.0  SDRL    0.91    1.03    1.09    1.14    1.17    1.17    1.26    1.33    1.52    1.64
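The percentage decrease in out-of-control ARL can be recomputed from the tabulated values. The short Python example below does this for the λ = 0.05 column, using the ARL entries of Tables 35.1 and 35.2; the function name is illustrative.

```python
DELTAS = [1.2, 1.4, 1.6, 1.8, 2.0]
ARL1_NEWMA = [14.52, 6.72, 4.52, 3.47, 2.92]   # Table 35.1, lambda = 0.05
ARL1_TNEWMA = [9.93, 3.78, 2.32, 1.77, 1.47]   # Table 35.2, lambda = 0.05

def percentage_decrease(base, improved):
    """Relative reduction of `improved` with respect to `base`, in percent."""
    return 100.0 * (base - improved) / base

decreases = [percentage_decrease(b, i)
             for b, i in zip(ARL1_NEWMA, ARL1_TNEWMA)]
for delta, dec in zip(DELTAS, decreases):
    print(f"delta = {delta:.1f}: {dec:.1f}% lower ARL1 with time varying limits")
```

For this column the reduction grows with the size of the shift, from roughly a third at δ = 1.2 to about a half at δ = 2.0.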

35.4 Effect of Fast Initial Response on Variability EWMA Chart We have seen in the previous section that the use of time varying control limits as compared to asymptotic limits significantly improves the out-of-control run length behavior of variability EWMA charts. A further increase in the sensitivity of EWMA chart to detect shifts in variability can be achieved by using an FIR feature. The FIR feature, introduced by [29] for CUSUM charts, detects

438

S. A. Abbasi and A. Miller λ = 0.25

0.0 0.5 1.0 1.5 2.0 2.5

NEWMA TNEWMA Log (ARL)

Log (ARL)

0.0 0.5 1.0 1.5 2.0 2.5

λ = 0.15

NEWMA TNEWMA

NEWMA TNEWMA

1.0 1.2 1.4 1.6 1.8 2.0

1.0 1.2 1.4 1.6 1.8 2.0

δ

δ

δ

λ = 0.50

λ = 0.70

λ = 1.00

1.0 1.2 1.4 1.6 1.8 2.0 δ

1.0 1.2 1.4 1.6 1.8 2.0

0.0 0.5 1.0 1.5 2.0 2.5

NEWMA TNEWMA Log (ARL)

Log (ARL)

NEWMA TNEWMA

0.0 0.5 1.0 1.5 2.0 2.5

Log (ARL) Log (ARL)

0.0 0.5 1.0 1.5 2.0 2.5

0.0 0.5 1.0 1.5 2.0 2.5

λ = 0.05

NEWMA TNEWMA

1.0 1.2 1.4 1.6 1.8 2.0

1.0 1.2 1.4 1.6 1.8 2.0

δ

δ

40 35 30 25 20 15 10

Percentage Decrease in ARL

45

δ= 1.2 δ= 1.4 δ= 1.6 δ= 1.8 δ= 2.0

0

5

Fig. 35.2 Percentage decrease in out-of-control ARL of TNEWMA chart as compared to NEWMA chart when ARL0 ¼ 200

50

Fig. 35.1 ARL comparison of NEWMA and TNEWMA charts for different values of k when ARL0 ¼ 200

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

λ

out-of-control signals more quickly at process startup by assigning some nonzero constant to the starting values of CUSUM chart statistics. Lucas and Saccucci [9] proposed the idea of applying the FIR feature to EWMA control structures by using two one-sided EWMA charts. Rhoads et al. [30] used the

35

Increasing the Sensitivity of Variability EWMA Control Charts

439

FIR approach for time varying control limits and showed that their proposed scheme outperformed the FIR scheme of [9]. Both schemes were criticized because they require two EWMA charts instead of one for monitoring changes in process parameters. Steiner [11] presented another FIR scheme for EWMA charts. His proposal further narrows the time varying control limits by means of an exponentially decreasing FIR adjustment, defined as

$$\mathrm{FIR}_{adj} = 1 - (1 - f)^{1 + a(t-1)}, \tag{35.13}$$

where $a$ is known as the adjustment parameter and is chosen such that the FIR adjustment has very little effect after a specified time period; for example, $a$ may be chosen so that $\mathrm{FIR}_{adj} = 0.99$ at $t = 20$. At $t = 1$ the adjustment makes the control limit a proportion $f$ of its distance from the starting value, and its effect decreases with time [11]. By comparing run length characteristics, Steiner [11] showed that his proposed FIR scheme outperformed the earlier FIR schemes of [9, 30]. The FIR adjustment used by [11] is very attractive and has recently been applied by [31] to generally weighted moving average control charts.

In this section we examine the effect of FIR on the performance of the variability EWMA chart. The time varying variability EWMA chart using FIR will be referred to as the FNEWMA chart for the rest of the study. The FNEWMA chart signals an out-of-control condition whenever $W_t$ exceeds $\mathrm{UCL}_f$, where $\mathrm{UCL}_f$ is given as

$$\mathrm{UCL}_f = L_f \left[1 - (1 - f)^{1 + a(t-1)}\right] \sigma_{Z_t^{+}} \sqrt{\frac{\lambda\left[1 - (1 - \lambda)^{2t}\right]}{2 - \lambda}}. \tag{35.14}$$

To obtain a substantial benefit from the FIR feature, $f$ should be fairly small. In this study we used $f = 0.5$ and limited the effect of the FIR adjustment to $t = 20$, following [11, 31]. The run length characteristics of the FNEWMA chart are reported in Table 35.3. ARL0 for the FNEWMA chart is again fixed at 200 by using appropriate $L_f$ values for the different choices of λ. Comparing the results in Tables 35.1, 35.2 and 35.3 shows the superior run length performance of the FNEWMA chart over the NEWMA and TNEWMA charts. For example, the FNEWMA chart has ARL1 = 10.72 for λ = 0.3 and δ = 1.2, while the corresponding ARL1 values for the TNEWMA and NEWMA charts are 16.65 and 17.30, respectively. The FNEWMA chart therefore requires on average about six fewer observations than the NEWMA and TNEWMA charts to detect a shift of 1.2σ in process variability when λ = 0.3. Figure 35.3 compares the ARLs of the NEWMA, TNEWMA and FNEWMA charts for some choices of λ. The ARL1 of the FNEWMA chart is consistently lower than the ARL1 of both the NEWMA and TNEWMA charts for every choice of λ. This indicates that the FNEWMA chart detects shifts in process variability more quickly than the other two charts; the difference appears greater for higher values of λ, which is consistent with the findings of [11].
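The behaviour of the FIR adjustment in Eq. (35.13) can be checked numerically. The following Python lines are a minimal sketch, not the authors' code: the helper names are hypothetical, and the in-control standard deviation term and $L_f$ of Eq. (35.14) are left out, so only the dimensionless part of the limit is computed.

```python
import math

def fir_adjustment(t, f, a):
    """FIR adjustment of Eq. (35.13): 1 - (1 - f)**(1 + a*(t - 1))."""
    return 1.0 - (1.0 - f) ** (1.0 + a * (t - 1))

def adjustment_parameter(f, t_end=20, target=0.99):
    """Solve fir_adjustment(t_end) = target for a (hypothetical helper)."""
    return (math.log(1.0 - target) / math.log(1.0 - f) - 1.0) / (t_end - 1)

def limit_factor(t, lam, f, a):
    """FIR adjustment times sqrt(lam*(1 - (1 - lam)**(2t)) / (2 - lam)).

    Multiplying this factor by L_f and by the standard deviation term of
    Eq. (35.14) would give UCL_f; both are omitted in this sketch.
    """
    ewma = math.sqrt(lam * (1.0 - (1.0 - lam) ** (2 * t)) / (2.0 - lam))
    return fir_adjustment(t, f, a) * ewma

f = 0.5                      # value used in the chapter
a = adjustment_parameter(f)  # chosen so that the adjustment is 0.99 at t = 20
for t in (1, 5, 10, 20):
    print(t, round(fir_adjustment(t, f, a), 4), round(limit_factor(t, 0.3, f, a), 4))
```

With $f = 0.5$ the adjustment starts at $f$ for $t = 1$ and approaches one by $t = 20$, so the narrowed limit converges to the ordinary time varying limit.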


Table 35.3 Run length characteristics of the FNEWMA chart when ARL0 = 200

δ    λ        0.05    0.10    0.15    0.20    0.25    0.30    0.50    0.70    0.90    1.00
     L_f     1.740   2.071   2.241   2.369   2.460   2.530   2.670   2.736   2.770   2.784

1.0  ARL    199.24  200.49  199.63  199.99  200.21  200.74  199.96  200.29  199.55  199.61
     MDRL    94.00  115.50  117.00  121.00  122.00  122.00  117.50  114.00  109.00  109.00
     SDRL   263.25  249.11  244.59  239.14  240.99  242.92  241.41  249.86  253.24  251.52
1.1  ARL     21.17   25.84   28.27   31.12   31.40   34.36   38.57   41.54   45.85   50.85
     MDRL     7.00   12.00   14.00   15.00   14.00   15.00   16.00   14.50   14.00   16.00
     SDRL    29.33   34.37   36.66   40.77   41.93   46.10   52.51   58.65   68.32   74.19
1.2  ARL      8.02    8.85    9.16    9.65   10.29   10.72   11.91   13.07   14.25   15.19
     MDRL     4.00    4.00    4.00    4.00    4.00    5.00    5.00    4.00    4.00    4.00
     SDRL    10.14   11.57   12.57   12.72   13.77   14.46   17.20   20.49   22.95   27.47
1.3  ARL      4.08    4.65    4.77    5.06    5.24    5.29    5.61    5.88    6.53    7.07
     MDRL     2.00    2.00    2.00    2.00    3.00    3.00    3.00    3.00    3.00    3.00
     SDRL     5.15    5.75    5.83    6.16    6.60    6.68    7.53    8.35   10.09   11.57
1.4  ARL      2.76    3.12    3.28    3.28    3.29    3.39    3.45    3.56    3.78    3.96
     MDRL     1.00    2.00    2.00    2.00    2.00    2.00    2.00    2.00    2.00    2.00
     SDRL     3.18    3.48    3.60    3.65    3.62    3.84    4.07    4.31    5.23    5.63
1.5  ARL      2.11    2.33    2.42    2.48    2.50    2.54    2.58    2.51    2.65    2.80
     MDRL     1.00    1.00    1.00    1.00    1.00    2.00    2.00    2.00    2.00    2.00
     SDRL     2.07    2.30    2.38    2.49    2.42    2.51    2.60    2.56    2.95    3.34
1.6  ARL      1.80    1.94    1.99    2.02    2.06    2.06    2.09    2.12    2.13    2.19
     MDRL     1.00    1.00    1.00    1.00    1.00    1.00    1.00    1.00    1.00    1.00
     SDRL     1.57    1.73    1.74    1.79    1.84    1.83    1.82    1.90    1.93    2.16
1.7  ARL      1.58    1.68    1.70    1.75    1.76    1.80    1.77    1.79    1.83    1.87
     MDRL     1.00    1.00    1.00    1.00    1.00    1.00    1.00    1.00    1.00    1.00
     SDRL     1.20    1.30    1.32    1.38    1.38    1.44    1.40    1.38    1.48    1.57
1.8  ARL      1.44    1.52    1.55    1.57    1.59    1.58    1.61    1.61    1.62    1.61
     MDRL     1.00    1.00    1.00    1.00    1.00    1.00    1.00    1.00    1.00    1.00
     SDRL     0.96    1.07    1.10    1.12    1.13    1.11    1.12    1.12    1.18    1.16
1.9  ARL      1.34    1.40    1.43    1.44    1.47    1.46    1.46    1.48    1.49    1.50
     MDRL     1.00    1.00    1.00    1.00    1.00    1.00    1.00    1.00    1.00    1.00
     SDRL     0.79    0.89    0.91    0.91    0.95    0.93    0.90    0.93    0.93    0.98
2.0  ARL      1.26    1.32    1.34    1.36    1.36    1.38    1.40    1.39    1.40    1.41
     MDRL     1.00    1.00    1.00    1.00    1.00    1.00    1.00    1.00    1.00    1.00
     SDRL     0.66    0.74    0.79    0.78    0.77    0.81    0.81    0.80    0.83    0.84

To get more insight into the run length distributions of the NEWMA, TNEWMA and FNEWMA charts, Fig. 35.4 presents run length curves (RLCs) of these charts for certain values of λ using δ = 1.2. For smaller values of λ, the RLCs of the TNEWMA chart lie above those of the NEWMA chart, indicating that the TNEWMA chart has a greater probability of shorter run lengths for these λ values. The superiority of the FNEWMA chart over the NEWMA and TNEWMA charts is also clear for all values of λ. Note that a high probability at shorter run lengths indicates that shifts in the process variability will be detected quickly with high probability.
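A run length curve is simply the empirical cumulative distribution of the simulated run lengths. The sketch below shows one way such a curve could be tabulated; the run lengths are made-up illustrative numbers, not simulation output from this study, and the function name is hypothetical.

```python
def run_length_curve(run_lengths, grid):
    """Empirical P(RL <= t) for each t in grid."""
    n = len(run_lengths)
    return [sum(1 for rl in run_lengths if rl <= t) / n for t in grid]

# Hypothetical run lengths for two charts (illustrative numbers only).
chart_a = [2, 3, 3, 5, 8, 13, 21, 40]
chart_b = [1, 1, 2, 3, 4, 6, 9, 15]
grid = [1, 5, 10, 20, 40]

curve_a = run_length_curve(chart_a, grid)
curve_b = run_length_curve(chart_b, grid)
for t, pa, pb in zip(grid, curve_a, curve_b):
    print(t, pa, pb)
```

A chart whose curve lies above another at every run length, as chart_b does here, signals earlier with higher probability.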

[Figure 35.3: six panels, one each for λ = 0.05, 0.15, 0.25, 0.50, 0.70 and 1.00, plotting Log(ARL) against δ for the NEWMA, TNEWMA and FNEWMA charts]

Fig. 35.3 ARL comparison of NEWMA, TNEWMA and FNEWMA charts for different values of λ when ARL0 = 200

35.5 Conclusions

This chapter examines the performance of the variability EWMA chart using asymptotic, time varying and FIR based control limits. It has been shown that the ability of the variability EWMA chart to detect shifts in variation can be improved by using exact (time varying) control limits instead of asymptotic ones, particularly for smaller values of the smoothing parameter λ. The FIR feature has also been shown to contribute significantly to further increasing the sensitivity of the EWMA chart to shifts in process variability. Computations were performed using the NEWMA chart, but the results can be generalized to the other variability EWMA charts discussed in Sect. 35.1. This study will help quality practitioners choose a more sensitive variability EWMA chart.


[Figure 35.4: four panels, one each for λ = 0.05, 0.20, 0.50 and 0.90, plotting Cumulative Probability against Run Length for the NEWMA, TNEWMA and FNEWMA charts]

Fig. 35.4 Run length curves of NEWMA, TNEWMA and FNEWMA charts for different values of λ when δ = 1.2 and ARL0 = 200

References

1. Hwang SL, Lin JT, Liang GF, Yau YJ, Yenn TC, Hsu CC (2008) Application control chart concepts of designing a pre-alarm system in the nuclear power plant control room. Nucl Eng Design 238(12):3522–3527
2. Woodall WH (2006) The use of control charts in health-care and public-health surveillance. J Qual Technol 38(2):89–104
3. Wang Z, Liang R (2008) Discuss on applying SPC to quality management in university education. In: Proceedings of the 9th international conference for young computer scientists, ICYCS 2008, pp 2372–2375
4. Masson P (2007) Quality control techniques for routine analysis with liquid chromatography in laboratories. J Chromatogr A 1158(1–2):168–173
5. Abbasi SA (2010) On the performance of EWMA chart in presence of two component measurement error. Qual Eng 22(3):199–213
6. Montgomery DC (2001) Introduction to statistical quality control, 4th edn. Wiley, New York
7. Ryan PR (2000) Statistical methods for quality improvement, 2nd edn. Wiley, New York
8. Roberts SW (1959) Control chart tests based on geometric moving averages. Technometrics 1(3):239–250
9. Lucas JM, Saccucci MS (1990) Exponentially weighted moving average control schemes. Properties and enhancements. Technometrics 32(1):1–12
10. Montgomery DC, Torng JCC, Cochran JK, Lawrence FP (1995) Statistically constrained economic design of the EWMA control chart. J Qual Technol 27(3):250–256
11. Steiner SH (1999) EWMA control charts with time-varying control limits and fast initial response. J Qual Technol 31(1):75–86
12. Chan LK, Zhang J (2000) Some issues in the design of EWMA charts. Commun Stat Part B Simul Comput 29(1):207–217
13. Maravelakis PE, Panaretos J, Psarakis S (2004) EWMA chart and measurement error. J Appl Stat 31(4):445–455
14. Carson PK, Yeh AB (2008) Exponentially weighted moving average (EWMA) control charts for monitoring an analytical process. Ind Eng Chem Res 47(2):405–411
15. Shu L, Jiang W (2008) A new EWMA chart for monitoring process dispersion. J Qual Technol 40(3):319–331
16. Abbasi SA (2010) On sensitivity of EWMA control chart for monitoring process dispersion. In: Lecture notes in engineering and computer science: proceedings of the World Congress on engineering 2010, vol III, WCE 2010, 30 June–2 July, 2010, London, UK, pp 2027–2032
17. Wortham AW, Ringer LJ (1971) Control via exponential smoothing. Transportation Logistic Rev 7:33–39
18. Domangue R, Patch SC (1991) Some omnibus exponentially weighted moving average statistical process monitoring schemes. Technometrics 33:299–313
19. Crowder SV, Hamilton M (1992) Average run lengths of EWMA controls for monitoring a process standard deviation. J Qual Technol 24:44–50
20. MacGregor JF, Harris TJ (1993) The exponentially weighted moving variance. J Qual Technol 25:106–118
21. Stoumbos ZG, Reynolds MR Jr (2000) Robustness to non normality and autocorrelation of individual control charts. J Stat Comput Simul 66:145–187
22. Chen GM, Cheng SW, Xie HS (2001) Monitoring process mean and variability with one EWMA chart. J Qual Technol 33:223–233
23. Barr DR, Sherrill ET (1999) Mean and variance of truncated normal distributions. Am Stat 53:357–361
24. Maravelakis P, Panaretos J, Psarakis S (2005) An examination of the robustness to nonnormality of the EWMA control charts for the dispersion. Commun Stat Simul Comput 34(4):1069–1079
25. Neubauer AS (1997) The EWMA control chart: properties and comparison with other quality-control procedures by computer simulation. Clin Chem 43(4):594–601
26. Zhang L, Chen G (2004) EWMA charts for monitoring the mean of censored Weibull lifetimes. J Qual Technol 36(3):321–328
27. Kim MJ (2005) Number of replications required in control chart Monte Carlo simulation studies. PhD Dissertation, University of Northern Colorado
28. Schaffer JR, Kim MJ (2007) Number of replications required in control chart Monte Carlo simulation studies. Commun Stat Simul Comput 36(5):1075–1087
29. Lucas JM, Crosier RB (1982) Fast initial response for CUSUM quality control schemes: give your CUSUM a head start. Technometrics 24(3):199–205
30. Rhoads TR, Montgomery DC, Mastrangelo CM (1996) A fast initial response scheme for the exponentially weighted moving average control chart. Qual Eng 9(2):317–327
31. Chiu WC (2009) Generally weighted moving average control charts with fast initial response features. J Appl Stat 36(3):255–275

Chapter 36

Assessing Response’s Bias, Quality of Predictions, and Robustness in Multiresponse Problems

Nuno Costa, Zulema Lopes Pereira and Martín Tanco

Abstract Optimization measures for evaluating compromise solutions in multiresponse problems formulated in the Response Surface Methodology framework are proposed. The measures take into account the desired properties of responses at optimal variable settings, namely the bias, quality of predictions and robustness. They allow the analyst to achieve compromise solutions that are of interest and feasible in practice, in particular when the method used does not consider the responses’ variance level and correlation information in the objective function. Two examples from the literature show the utility of the proposed measures.

N. Costa (✉)
Setúbal Polytechnic Institute, College of Technology, Campus do IPS, Estefanilha, 2910-761 Setúbal, Portugal
e-mail: [email protected]

N. Costa · Z. L. Pereira
UNIDEMI/DEMI, Faculdade de Ciências e Tecnologia, Universidade Nova de Lisboa, 2829-516 Caparica, Portugal
e-mail: [email protected]

M. Tanco
CITEM, Universidad de Montevideo, Luis P. Ponce, 1307, 11300 Montevideo, Uruguay
e-mail: [email protected]

S. I. Ao and L. Gelman (eds.), Electrical Engineering and Applied Computing, Lecture Notes in Electrical Engineering, 90, DOI: 10.1007/978-94-007-1192-1_36, © Springer Science+Business Media B.V. 2011

36.1 Introduction

Statistical tools and methodologies like the response surface methodology (RSM) have been increasingly used in industry and have become a change agent in the way design and process engineers think and work [9]. In particular, RSM has been used for developing more robust systems (process and product), and for improving and optimizing systems performance with the required efficiency and effectiveness. Readers are referred to Myers et al. [18] for a thorough discussion of this methodology.

While most case studies reported in the literature focus on the optimization of a single quality characteristic of a process or product, the variety of real-life problems requires the consideration of multiple quality characteristics (objectives; responses). This fact, together with the researchers’ desire to propose enhanced techniques using recent advancements in mathematical optimization, scientific computing and computer technology, has made multiresponse optimization an active research field. New algorithms and methodologies have been developed and their diffusion into various disciplines has proceeded at a rapid pace. Researchers are currently paying great attention to hybrid approaches that avoid premature convergence of the algorithm toward a local maximum or minimum and reach the global optimum in problems with multiple responses [24]. Readers are referred to Younis and Dong [25] for a review of the historical development, special features and trends in the development of global optimization algorithms. These authors also examine and compare a number of representative and recently introduced global optimization techniques. The issue is that the level of computational and mathematical or statistical expertise required for using those algorithms or methodologies and solving such problems successfully is significant. This makes such sophisticated tools hard to adopt, in particular by practitioners [1].

A strategy widely used for optimizing multiple responses in the RSM framework consists of converting the multiple responses into a single (composite) function and then optimizing it, using either the generalized reduced gradient or the sequential quadratic programming algorithm available in the popular Microsoft Excel® (Solver add-in) and Matlab® (fmincon routine), respectively.
To form that composite function, the desirability function-based and loss function-based methods are the most popular among practitioners. The existing methods use distinct composite functions to indicate how close the response values are to their targets, but the widely used desirability-based methods do not consider the responses’ variance level and correlation information, and the composite function gives the analyst no information on them. What the analyst knows is that either a higher or a lower value is preferred, depending on how the composite function is defined. The composite functions of the loss function-based methods present the result in monetary terms, that is, the compromise solution is expressed by a monetary loss that must be as low as possible, and some of those composite functions consider the variance–covariance structure of responses.

However, the composite functions of loss- and desirability-based methods have a serious drawback. They may give inconsistent results, namely, different results for the same response values (compromise solution). This may confuse the analyst and hinder the evaluation of compromise solutions, as he/she needs to check the values of each response considered in the study to identify a compromise solution of interest. This difficulty grows with the number of available solutions and responses. So, the authors propose optimization performance measures with a threefold purpose:


I. Provide relevant information to the analyst so that he/she may achieve compromise solutions of interest and feasible in practice whenever the method used does not consider the variance–covariance structure of responses in the objective function.

II. Help the analyst in evaluating the feasibility of compromise solutions by assessing separately the response’s bias (deviation of responses from their target), quality of predictions (variance due to uncertainty in the regression coefficients of predicted responses) and robustness (variance due to uncontrollable variables).

III. Allow the evaluation of solutions from methods that cannot be compared directly because of the different approaches underlying those methods, for example, loss function and desirability function approaches.

The feasibility of the proposed measures is illustrated through desirability and loss function-based methods by using two examples from the literature. The remaining sections are structured as follows: the next section provides a review of analysis methods. Then the optimization measures are introduced. The subsequent sections present the examples and the discussion of results, respectively. Conclusions and future work are presented in the last section.

36.2 Methods for Multiresponse Analysis

The desirability function-based and loss function-based methods are the most popular ones among practitioners who, in the RSM framework, look for optimum variable settings for the process and product whenever multiple responses are considered simultaneously. Therefore, the methods that are widely used in practice and will serve to illustrate the feasibility of the optimization measures are reviewed below. Many other alternative approaches are available in the literature, and reviews of them are provided by Gauri and Pal [8], Kazemzadeh et al. [10] and Murphy et al. [16].

36.2.1 Desirability-Based Methods

The desirability-based methods are easy to understand and flexible for incorporating the decision-maker’s preferences (priority to responses). The most popular of them, the so-called Derringer and Suich’s method [6], or modifications of it [5], is available in many data analysis software packages. However, to use this method the analyst needs to assign values to four shape parameters (weights). This is not a simple task, and the values chosen affect the optimal variable settings. An alternative desirability-based method that, under the assumptions of normality and homogeneity of error variances, requires minimum information from the user was proposed by Ch’ng et al. [2]. Their method is easy to understand and to implement in the readily available Microsoft Excel® Solver tool and, in addition, requires less cognitive effort from the analyst. The user only has to assign values to one type of shape parameter (weights), which is a relevant advantage over the extensively used Derringer and Suich’s method. Ch’ng et al. [2] suggested individual desirability functions of the form

$$d = \frac{2\hat{y} - (U + L)}{U - L} + 1 = \frac{2\hat{y}}{U - L} + \frac{-2L}{U - L} = m\hat{y} + c \tag{36.1}$$

where $0 \le d \le 2$ and $\hat{y}$ represents the response’s model with upper and lower bounds defined by $U$ and $L$, respectively. The global desirability (composite) function is defined as

$$D = \left( \sum_{i=1}^{p} e_i \left| d_i - d_i(\theta_i) \right| \right) \Big/ p \tag{36.2}$$

where $d_i(\theta_i)$ is the value of the individual desirability function $i$ at the target value $\theta_i$, $e_i$ is the weight (degree of importance or priority) assigned to response $i$, $p$ is the number of responses, and $\sum_{i=1}^{p} e_i = 1$. The aim is to minimize $D$.

Although Ch’ng et al. illustrate their method only for the nominal-the-best response type (NTB: the value of the estimated response is expected to achieve a particular target value), in this article the larger-the-best (LTB: the value of the estimated response is expected to be larger than a lower bound) and smaller-the-best (STB: the value of the estimated response is expected to be smaller than an upper bound) response types are also considered. In these cases, $d_i(U_i)$ and $d_i(L_i)$ are used in Eq. 36.2 instead of $d_i(\theta_i)$, under the assumption that the specification limits $U$ and $L$ of those responses can be established from product knowledge or practical experience. Using the maximum or minimum value of the response model is an alternative. A limitation of this method is that it does not consider the quality of predictions and robustness in the optimization process.
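Equations 36.1 and 36.2 are simple enough to check by hand or in a few lines of code. The Python sketch below follows the two formulas as written above; the function names and the numerical values are illustrative assumptions and are not taken from the chapter’s tables.

```python
def desirability(y_hat, lower, upper):
    """Individual desirability of Eq. (36.1): d = 2*(y_hat - L)/(U - L)."""
    return 2.0 * (y_hat - lower) / (upper - lower)

def global_desirability(y_hats, targets, bounds, weights):
    """Composite D of Eq. (36.2): weighted gaps |d_i - d_i(theta_i)| divided by p."""
    p = len(y_hats)
    total = 0.0
    for y, theta, (lo, up), e in zip(y_hats, targets, bounds, weights):
        total += e * abs(desirability(y, lo, up) - desirability(theta, lo, up))
    return total / p

# Illustrative values only (not from the chapter's tables).
bounds = [(80.0, 100.0), (55.0, 60.0)]
targets = [100.0, 57.5]   # first response treated as LTB, so its target is U
y_hats = [95.0, 58.0]
weights = [0.5, 0.5]      # weights sum to one

d_value = global_desirability(y_hats, targets, bounds, weights)
print(d_value)
```

Because $d$ is linear in $\hat{y}$, it equals 0 at $L$ and 2 at $U$, and $D$ is zero only when every response sits exactly on its target.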

36.2.2 Loss Function-Based Methods

The loss function approach takes a totally different view of multiresponse optimization by considering monetary aspects in the optimization process. This approach is very popular among the industrial engineering community and, unlike the above-mentioned desirability-based methods, there are loss function-based methods that consider the responses’ variance level and exploit the responses’ correlation information, which is statistically sound. Examples of those methods were introduced by Vining [21] and Lee and Kim [13].

Vining [21] proposed a loss function-based method that allows specifying the directions of economic importance for the compromise optimum, while seriously considering the variance–covariance structure of the expected responses. This method aims at finding the variable settings that minimize an expected loss function defined as

$$E[L(\hat{y}(x), \theta)] = \left(E[\hat{y}(x)] - \theta\right)^{T} C \left(E[\hat{y}(x)] - \theta\right) + \operatorname{trace}\left[C\, \Sigma_{\hat{y}}(x)\right] \tag{36.3}$$

where $\Sigma_{\hat{y}}(x)$ is the variance–covariance matrix of the predicted responses at $x$ and $C$ is a cost matrix related to the costs of non-optimal design. If $C$ is a diagonal matrix, then each element represents the relative importance assigned to the corresponding response, that is, the penalty (cost) incurred for each unit of response value deviated from its optimum. If $C$ is a non-diagonal matrix, the off-diagonal elements represent additional costs incurred when pairs of responses are simultaneously off-target. The first term in Eq. 36.3 represents the penalty due to the deviation from the target; the second term represents the penalty due to the quality of predictions.

Lee and Kim [13], like Pignatiello [19] and Wu and Chyu [22], emphasize bias reduction and robustness improvement. They proposed minimizing an expected loss defined as

$$E[L(y(x), \theta)] = \sum_{i=1}^{p} c_i \left[(\hat{y}_i - \theta_i)^2 + \hat{\sigma}_i^2\right] + \sum_{i=2}^{p} \sum_{j=1}^{i-1} c_{ij} \left[\hat{\sigma}_{ij} + (\hat{y}_i - \theta_i)(\hat{y}_j - \theta_j)\right] \tag{36.4}$$

where $c_i$ and $c_{ij}$ represent weights (priorities or costs), and $\hat{\sigma}_i^2$ and $\hat{\sigma}_{ij}$ are elements of the response’s variance–covariance structure at $x$ ($\Sigma_{y}(x)$). A key difference between Eqs. 36.3 and 36.4 is that the latter uses the variance–covariance structure of the responses rather than the variance–covariance structure of the predicted responses. Moreover, Lee and Kim’s method requires replicates at each design run, which will certainly increase the time and cost of experimentation. This extra effort is justified only if the variance due to uncontrollable variables is a problem in practice. A difficulty with all loss function-based methods is to take into account different scales, relative variabilities and relative costs in matrix $C$ [12, 21].
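For two responses, Eq. 36.4 reduces to three terms and can be evaluated directly. The sketch below is an illustration rather than Lee and Kim’s implementation. It assumes that the weight triples in Table 36.2 are ordered as $(c_1, c_2, c_{12})$ and that the tabulated variance–covariance entries are $(\hat{\sigma}_1, \hat{\sigma}_2, \hat{\sigma}_{12})$, with the first two being standard deviations; the function name is hypothetical.

```python
def lee_kim_expected_loss(y_hat, theta, sigma, sigma12, c, c12):
    """Two-response case of Eq. (36.4).

    y_hat, theta, sigma: pairs of predicted means, targets and standard deviations.
    sigma12: predicted covariance; c: weights (c_1, c_2); c12: cross weight.
    """
    dev = [y - t for y, t in zip(y_hat, theta)]
    own = sum(ci * (d * d + s * s) for ci, d, s in zip(c, dev, sigma))
    cross = c12 * (sigma12 + dev[0] * dev[1])
    return own + cross

theta = (100.0, 300.0)  # targets of Example 2
# Inputs read from Table 36.2 (Lee and Kim, Cases I and II).
case_1 = lee_kim_expected_loss((97.86, 301.40), theta, (7.80, 22.96), 6.39, (1.0, 1.0), 1.0)
case_2 = lee_kim_expected_loss((98.06, 300.32), theta, (7.84, 22.98), 6.39, (0.3, 0.5), 0.02)
print(round(case_1, 1), round(case_2, 1))
```

Under these assumptions the two values come out at roughly 597.9 and 283.8, close to the 598.1 and 283.7 reported in Table 36.2; the small gaps are consistent with rounding of the tabulated inputs.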

36.3 Measures of Optimization Performance

To evaluate the feasibility of compromise solutions in multiresponse problems, the analyst needs information about the response’s properties at ‘‘optimal’’ variable settings, namely the bias and variance. In fact, responses at some variable settings may have considerable variance due to the uncertainty in the regression coefficients of predicted responses, and may be sensitive to uncontrollable variables whose effect is significant and, therefore, cannot be ignored.

In the RSM framework few authors have addressed the evaluation of the response’s properties to the extent it deserves. In general they focus on the output of the objective function they use. Authors who compare the performance of several methods by evaluating the responses’ properties at variable settings through optimization performance measures are Lee and Kim [13], Ko et al. [12] and Xu et al. [23]. While Lee and Kim [13] and Ko et al. [12] use the terms or components of the objective function they propose for comparing the results of loss function-based methods in terms of the desired response’s properties, Xu et al. [23] propose new optimization performance measures. The major shortcomings of the proposals of these authors are the following:

1. The optimization measures used by Lee and Kim [13] and Ko et al. [12] require the definition of a cost matrix, which is neither easy to define nor readily available.
2. The optimization measures used by Xu et al. [23] only allow the evaluation of the response’s bias.

To compare method results or compromise solutions in multiresponse optimization problems it is necessary to consider the statistical properties of the methods used in addition to the response’s bias and variance. In fact, optimization methods may differ in terms of statistical properties and optimization schemes, so the evaluation and comparison of the corresponding solutions in a straightforward manner may not be possible. For example, the global desirability values of methods that either maximize or minimize the global desirability are comparable neither with each other nor with the result (monetary loss) achieved from a loss function-based method. Optimization measures are therefore proposed with the aim of providing useful information to the analyst or decision-maker concerning the desired response’s properties (bias, quality of predictions, and robustness) and of evaluating the solutions obtained from different methods.
Those measures allow the separate assessment of the bias, quality of predictions, and robustness, which may help the analyst in achieving a solution of interest and guide him/her during the optimization process, in particular when quality of predictions and robustness are important issues in practice. In addition, they may also serve to evaluate the solutions obtained from different methods and help the practitioner or researcher make a more informed decision when choosing a method for optimizing multiple responses.

To assess the method’s solutions in terms of bias, an optimization measure is suggested that considers the response types, the response’s specification limits and the deviation of responses from their target. This measure, named cumulative bias (Bcum), is defined as

$$B_{cum} = \sum_{i=1}^{p} W_i \left| \hat{y}_i - \theta_i \right| \tag{36.5}$$

where $\hat{y}_i$ represents the estimated response value at ‘‘optimal’’ variable settings, $\theta_i$ is the target value and $W_i$ is a parameter that takes into account the specification limits and response type of the $i$th response. This parameter is defined as follows: $W = 1/(U - L)$ for STB and LTB response types; $W = 2/(U - L)$ for the NTB response type. The cumulative bias gives an overall result of the optimization process instead of focusing on the value of a single response, which helps prevent unreasonable decisions in some cases [11]. To assess the bias of each response, the practitioner may use the individual bias (Bi), defined as

$$B_i = W_i \left| \hat{y}_i - \theta_i \right| \tag{36.6}$$

Alternatives to Bcum and Bi are presented by Xu et al. [23]. These authors use $W_i = 1/\theta_i$ and consider the mean value for the cumulative bias. For the individual bias they consider $W_i = 1$.

The measure proposed for assessing the method’s solutions in terms of quality of predictions is defined by

$$QoP = \operatorname{trace}\left[\varphi\, \Sigma_{\hat{y}}(x)\right] = \operatorname{trace}\left[\varphi\, x_j^{T} \left(X^{T} Q^{-1} X\right)^{-1} x_j\right] \tag{36.7}$$

where $x_j$ is the subset of independent variables consisting of the $K \times 1$ vector of regressors, with $N$ observations on $K_i$ regressors for the $i$th response, $X$ is an $Np \times K$ block diagonal matrix and $Q = \Sigma \otimes I_N$. An estimate of $\Sigma$ is given by $\hat{\sigma}_{ij} = \hat{e}_i^{T} \hat{e}_j / N$, where $\hat{e}_i$ is the residual vector from the OLS estimation of response $i$, $I_N$ is an identity matrix and $\otimes$ represents the Kronecker product. To make $\Sigma_{\hat{y}}(x)$ dimensionless, this matrix is multiplied by matrix $\varphi$, whose diagonal and non-diagonal elements are $\varphi_{ii} = 1/(U_i - L_i)^2$ and $\varphi_{ir} = 1/[(U_i - L_i)(U_r - L_r)]$ for $i \ne r$, respectively. QoP is defined under the assumption that the seemingly unrelated regression (SUR) method is employed to estimate the regression models (response surfaces), as it yields regression coefficients at least as accurate as those of other popular regression techniques, namely ordinary and generalized least squares [7, 20]. If ordinary least squares is used, the reader is referred to Vining [21], who presents variants of Eq. 36.7 for regression models with equal and different forms.

The robustness is assessed by

$$Rob = \operatorname{trace}\left[\varphi\, \Sigma_{y}(x)\right] \tag{36.8}$$

where $\Sigma_{y}(x)$ represents the variance–covariance matrix of the ‘‘true’’ responses. Note that replications of the experimental runs are required to assess the robustness, and that matrix $\varphi$ only considers the specification limits of the variance models, while Lee and Kim [13] and Ko et al. [12] use a cost matrix ($C$). Although the replicates increase the time and cost of experimentation, they may provide significant improvements in robustness that outweigh, or at least compensate for, the time and cost spent.


Bi, Bcum, QoP and Rob are dimensionless ratios, so concerns about the dimensional consistency of responses do not arise. These measures do not exclude others from being used as well and, in terms of results, the lower their values are, the better the compromise solution will be. In practice, all the proposed measures take values greater than or equal to zero, with zero being the most favorable.
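The bias measures of Eqs. 36.5 and 36.6 can be reproduced from a handful of numbers. The following Python sketch uses the specification limits and response values of Example 1 (Sect. 36.4, Table 36.1); the function names and the tuple layout of `specs` are assumptions made for this illustration.

```python
def weight(kind, lower, upper):
    """W_i of Eq. (36.5): 1/(U - L) for STB and LTB, 2/(U - L) for NTB."""
    return (2.0 if kind == "NTB" else 1.0) / (upper - lower)

def individual_bias(y_hat, theta, kind, lower, upper):
    """B_i = W_i * |y_hat_i - theta_i|, Eq. (36.6)."""
    return weight(kind, lower, upper) * abs(y_hat - theta)

def cumulative_bias(y_hats, specs):
    """B_cum of Eq. (36.5); specs holds (kind, L, U, theta) per response."""
    return sum(individual_bias(y, th, k, lo, up)
               for y, (k, lo, up, th) in zip(y_hats, specs))

# Example 1: conversion is LTB on [80, 100], thermal activity is NTB on [55, 60].
specs = [("LTB", 80.0, 100.0, 100.0), ("NTB", 55.0, 60.0, 57.5)]
solutions = {
    "Case I": (95.19, 57.50),
    "Case II": (98.04, 55.00),
    "Vining": (95.24, 58.27),
}
bcum = {name: cumulative_bias(y, specs) for name, y in solutions.items()}
for name, value in bcum.items():
    print(name, round(value, 2))
```

The resulting values agree, to two decimals, with the Bcum row of Table 36.1 (0.24, 1.10 and 0.55), which suggests the weights $W_i$ are being read as intended.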

36.4 Examples

Two examples from the literature illustrate the utility of the proposed performance measures. The first considers a case study in which the quality of predictions is the adverse condition; the methods introduced by Ch’ng et al. [2] and Vining [21] are used. The second considers robustness as the adverse condition; the methods introduced by Ch’ng et al. [2] and Lee and Kim [13] are used.

Example 1 The responses’ specification limits and targets for the percent conversion ($y_1$) and thermal activity ($y_2$) of a polymer are the following: $\hat{y}_1 \ge 80.00$ with $U_1 = \theta_1 = 100$; $55.00 \le \hat{y}_2 \le 60.00$ with $\theta_2 = 57.50$. Reaction time ($x_1$), reaction temperature ($x_2$), and amount of catalyst ($x_3$) are the control factors. According to Myers and Montgomery [17], the objective was to maximize the percent conversion and achieve the nominal value for the thermal activity. A central composite design with six axial and six center points, with $-1.682 \le x_i \le 1.682$, was run to generate the data. The predicted responses, fitted by the SUR method, are as follows:

$$\hat{y}_1 = 81.09 + 1.03x_1 + 4.04x_2 + 6.20x_3 - 1.83x_1^2 + 2.94x_2^2 - 5.19x_3^2 + 2.13x_1x_2 + 11.38x_1x_3 - 3.88x_2x_3$$

$$\hat{y}_2 = 59.85 + 3.58x_1 + 0.25x_2 + 2.23x_3 + 0.83x_1^2 + 0.07x_2^2 + 0.06x_3^2 - 0.39x_1x_2 - 0.04x_1x_3 + 0.31x_2x_3$$

The model of the thermal activity includes some insignificant regressors ($x_2$, $x_1^2$, $x_2^2$, $x_3^2$, $x_1x_2$, $x_1x_3$, $x_2x_3$), so the predicted response has a poor quality of prediction. In particular, the variance of this estimated response grows as the variable settings move farther from the origin. The variance–covariance matrix is estimated as

$$\hat{\Sigma} = \begin{bmatrix} 11.12 & 0.55 \\ 0.55 & 1.55 \end{bmatrix}.$$

Table 36.1 Results: Example 1

           Ch’ng et al.                                                                      Vining
           Case I                     Case II                    Case III
Weights    (0.30, 0.70)               (0.50, 0.50)               (0.60, 0.40)               C = [0.100 0.025; 0.025 0.500]
x_i        (-0.544, 1.682, -0.599)    (-1.682, 1.682, -1.059)    (-0.538, 1.682, -0.604)    (-0.355, 1.682, -0.468)
ŷ_i        (95.19, 57.50)             (98.04, 55.00)             (95.19, 57.50)             (95.24, 58.27)
Result     D = 0.14                   D = 0.35                   D = 0.29                   E(loss) = 3.86
Bcum       0.24                       1.10                       0.24                       0.55
Bi         (0.24, 0.00)               (0.10, 1.00)               (0.24, 0.00)               (0.24, 0.31)
QoP        0.08                       0.31                       0.08                       0.06

As regards the results, Table 36.1 shows that the global desirability function ($D$) yields different values for the same response values (cases I and III). This is neither desirable nor reasonable, and may confuse analysts who rely on the value of $D$ for making decisions. In contrast, Bcum and QoP remain unchanged, as is to be expected in these instances. By using these measures the analyst can easily perceive whether the changes he/she made in the weights are favorable or unfavorable in terms of response values. An increase in Bcum or QoP means that the changes made in the weights are unfavorable, that is, the value of at least one of the responses is farther from its target, as is the case for $\hat{y}_2$ in Vining’s solution, or the quality of predictions is lower, as occurs in case II. Case II illustrates that, by looking at the QoP value, the analyst can distinguish solutions with larger variability from others with smaller variability, for example cases I and III. Vining’s solution is the best in terms of the QoP value, because its $x_1$ and $x_3$ values are slightly closer to the origin than in the other cases, namely cases I and III, which present the same value of QoP. These results provide evidence that the proposed measures give better indications (information) to the analyst and can help him/her in achieving feasible solutions if the quality of predictions is an adverse condition.

Example 2 Lee and Kim [13] assumed that the fitted response functions for the process mean, variance and covariance of two quality characteristics are as follows:

$$\hat{y}_1 = 79.04 + 17.74x_1 + 0.62x_2 + 14.79x_3 - 0.70x_1^2 - 10.95x_2^2 - 0.10x_3^2 - 5.39x_1x_2 + 1.21x_1x_3 - 1.79x_2x_3$$

$$\hat{\sigma}_1 = 4.54 + 3.92x_1 + 4.29x_2 + 1.66x_3 + 1.15x_1^2 + 4.40x_2^2 + 0.94x_3^2 + 3.49x_1x_2 + 0.74x_1x_3 + 1.19x_2x_3$$

$$\hat{y}_2 = 400.15 - 95.21x_1 - 28.98x_2 - 55.99x_3 + 20.11x_1^2 + 26.80x_2^2 + 10.91x_3^2 + 57.13x_1x_2 - 3.73x_1x_3 - 10.87x_2x_3$$

$$\hat{\sigma}_2 = 26.11 - 1.34x_1 + 6.71x_2 + 0.37x_3 + 0.77x_1^2 + 2.99x_2^2 - 0.97x_3^2 - 1.81x_1x_2 + 0.41x_1x_3$$


Table 36.2 Results: Example 2

Quantity   Lee and Kim, Case I        Lee and Kim, Case II       Lee and Kim, Case III      Ch’ng et al.
Weights    (1, 1, 1)                  (0.3, 0.5, 0.02)           (0.8, 0.3, 1.0)            (0.25, 0.25, 0.15, 0.35)
x_i        (0.79, -0.76, 1.00)        (0.80, -0.77, 1.00)        (1.00, -1.00, 0.43)        (0.80, -0.75, 1.00)
ŷ_i        (97.86, 301.40)            (98.06, 300.32)            (74.22, 346.45)            (98.18, 300.00)
Var-cov    (7.80, 22.96, 6.39)        (7.84, 22.98, 6.39)        (5.89, 23.12, 4.35)        (7.86, 22.99, 6.38)
Result     E(loss) = 598.1            E(loss) = 283.7            E(loss) = 173.9            D = 0.53
Bcum       1.76                       1.75                       2.39                       1.75
Bi         (0.05, 0.78, 0.01, 0.92)   (0.05, 0.78, 0.00, 0.92)   (0.64, 0.59, 0.23, 0.92)   (0.05, 0.79, 0.00, 0.92)
Rob        0.053                      0.053                      0.036                      0.053

\[
\hat{\sigma}_{12} = 5.45 - 0.77x_1 + 0.16x_2 + 0.49x_3 - 0.42x_1^2 + 0.50x_2^2 - 0.35x_3^2 - 0.63x_1x_2 + 1.13x_1x_3 - 0.30x_2x_3
\]

In this example it is assumed that the response specifications are: $\hat{y}_1 \ge 60$ with $U_1 = \theta_1 = 100$; $\hat{\sigma}_1 \le 10$ with $L_1 = \theta_1 = 0$; $\hat{y}_2 \le 500$ with $L_2 = \theta_2 = 300$; $\hat{\sigma}_2 \le 25$ with $L_2 = \theta_2 = 0$; $-1 \le x_i \le 1$.

As regards the results, Table 36.2 shows that the loss function proposed by Lee and Kim yields different expected loss values for solutions with marginal differences in the response values (cases I and II). In contrast, the Bcum value remains practically unchanged in these situations, confirming its utility for assessing compromise solutions for multiresponse problems. Moreover, note that the lowest expected loss value is obtained from the solution with the worst values for $\hat{y}_1$ and $\hat{y}_2$ (case III), which is absurd. Nevertheless, this example provides evidence that, by looking at the Rob value, the analyst can distinguish more robust solutions (case III) from others with larger variability due to uncontrollable factors (cases I and II, and Ch’ng et al.’s solution). Note that Ch’ng et al.’s method yields a solution similar to cases I and II when appropriate weights are assigned to $\hat{y}_2$ and $\hat{\sigma}_2$, with the weights assigned to $\hat{y}_1$ and $\hat{\sigma}_1$ remaining unchanged (equal to 0.25). This example confirms that the proposed measures give useful information to the analyst and can help him/her achieve feasible solutions if robustness is an adverse condition.
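The fitted models of Example 2 are ordinary second-order polynomials, so a candidate setting can be checked by direct evaluation. The sketch below is only an illustration: the names `MODELS`, `quadratic_terms`, `predict` and `predict_all` are hypothetical, the coefficient signs are those read from the printed equations, and the $x_2x_3$ coefficient of $\hat{\sigma}_2$ is assumed to be zero because no such term is printed.

```python
# Second-order models of Example 2 (coefficients as printed in the text).
# Coefficient order: 1, x1, x2, x3, x1^2, x2^2, x3^2, x1*x2, x1*x3, x2*x3
MODELS = {
    "y1":      [79.04, 17.74, 0.62, 14.79, -0.70, -10.95, -0.10, -5.39, 1.21, -1.79],
    "sigma1":  [4.54, 3.92, 4.29, 1.66, 1.15, 4.40, 0.94, 3.49, 0.74, 1.19],
    "y2":      [400.15, -95.21, -28.98, -55.99, 20.11, 26.80, 10.91, 57.13, -3.73, -10.87],
    # Assumption: no x2*x3 term is printed for sigma2, so it is set to 0.
    "sigma2":  [26.11, -1.34, 6.71, 0.37, 0.77, 2.99, -0.97, -1.81, 0.41, 0.0],
    "sigma12": [5.45, -0.77, 0.16, 0.49, -0.42, 0.50, -0.35, -0.63, 1.13, -0.30],
}


def quadratic_terms(x1, x2, x3):
    """Return the ten regressors of a full quadratic model in three factors."""
    return [1.0, x1, x2, x3, x1 * x1, x2 * x2, x3 * x3, x1 * x2, x1 * x3, x2 * x3]


def predict(name, x):
    """Evaluate one fitted model at the coded setting x = (x1, x2, x3)."""
    return sum(c * t for c, t in zip(MODELS[name], quadratic_terms(*x)))


def predict_all(x):
    """Evaluate every fitted model at x and return a name -> value mapping."""
    return {name: predict(name, x) for name in MODELS}


if __name__ == "__main__":
    case_i = (0.79, -0.76, 1.00)  # rounded setting reported for case I
    for name, value in predict_all(case_i).items():
        print(f"{name:8s} {value:8.2f}")
```

At the rounded case I setting, the standard deviation and covariance predictions agree with the Var-cov row of Table 36.2 to within a few hundredths. The mean predictions need not reproduce the tabulated values, since the printed coefficients and the rounded settings may differ from those used to build the table.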

36.5 Discussion

The optima are stochastic by nature, and understanding the variability of responses is a critical issue for practitioners. Thus, assessing with appropriate measures the responses’ sensitivity to uncontrollable factors, in addition to the estimated responses’ variance level at "optimal" variable settings, provides the information the analyst requires to evaluate compromise solutions in multiresponse optimization problems. For this purpose the QoP and Rob measures are introduced, in addition to measures for assessing the response’s bias (Bi and Bcum).

The previous examples show that the expected loss and global desirability functions may give the analyst inconsistent and incomplete information about the methods’ solutions, namely in terms of the merit of the final solution and the desired response properties. This is a relevant shortcoming, which is due to the different weights or priorities assigned to the responses in the composite function. Those composite functions yield different results in cases where the solutions are equal or differ only slightly in the response values, as illustrated in Examples 1 and 2. Example 2 also shows that absurd results may occur in loss functions if the elements of matrix C are not defined properly. In fact, the loss coefficients ($c_{ij}$) play a major role in the achievement of optimal parameter conditions that result in trade-offs of interest among responses [22]. In particular, the non-diagonal elements represent incremental costs incurred when pairs of responses are simultaneously off-target, and have to satisfy theoretical conditions that the practitioner may not be aware of or take into account. Those conditions for symmetric loss functions are: $c_{11}, c_{22} \ge 0$ and $-2\sqrt{c_{11}c_{22}} \le c_{12} \le \sqrt{c_{11}c_{22}}$. When these conditions are not satisfied, worse solutions may produce spuriously better (lower) values of the loss function, as illustrated by case III in Example 2. Wu and Chyu [22] provide guidelines for defining the $c_{ij}$ for symmetric and asymmetric loss functions, but additional subjective information is required from the analyst.
Therefore, if the analyst focuses only on the result of the composite function when making decisions, he/she may ignore a solution of interest or be confused about the direction in which to change the weights or priorities assigned to responses, as the composite function may give unreliable information. By using the proposed measures the analyst does not have to worry about the reliability of the information, as they do not depend on the priorities assigned to responses. For this reason, the proposed measures may also serve to compare the performance of methods that use different approaches, for example, desirability function-based methods and loss function-based methods. Similarly, they make it possible to compare methods structured under the same approach but using different composite functions, as is the case of Derringer and Suich’s method, where the composite function is a multiplicative function that must be maximized, and Ch’ng et al.’s method, where the composite function is an additive function that must be minimized.

From a theoretical point of view, methods that consider the responses’ variance level and exploit the responses’ correlation information lead to solutions that are more realistic when the responses have significantly different variance levels or are highly correlated [12]. However, the previous examples show that the proposed measures can provide useful information to the analyst so that he/she achieves compromise solutions with desired properties at "optimal" settings by using methods that do not consider the variance–covariance structure of the responses in the objective function. Nevertheless, it is important to highlight that points in non-convex response surfaces cannot be captured by weighted sums like those represented by the objective functions reviewed here, even if the proposed measures are used. Publications where this and other limitations of the methods are addressed include Das and Dennis [4] and Mattson and Messac [14]. This means that the proposed optimization measures are not a panacea for achieving optimal solutions. In fact, Messac et al. [15] demonstrated that the ability of an objective function to capture points in convex and concave surfaces depends on the presence of parameters that the analyst can use to manipulate the composite function’s curvature. Although they show that using exponents to assign priorities to responses is a more effective practice for capturing points in convex and highly concave surfaces, assigning weights to responses is a critical task in multiresponse problems. It usually involves an undefined trial-and-error weight-tweaking process that may be a source of frustration and significant inefficiency, particularly when the number of responses and control factors is large. So, the need for methods that require minimum subjective information from the analyst is apparent. A possible choice is the method proposed by Costa [3]. According to this author, besides the low number of weights required from the analyst, the method he proposes has three major characteristics: effectiveness, simplicity and ease of application. This makes the method appealing to use in practice and supports the development of an iterative procedure to achieve compromise solutions to multiresponse problems in the RSM framework.
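The limitation of weighted sums on non-convex surfaces can be seen on a toy problem. The sketch below is not taken from the cited works: the two objectives, the grid sizes and the function names are assumptions chosen only to make the effect visible. Both objectives are minimized, every t in [0, 1] is a compromise (Pareto-optimal) point, and the trade-off curve f2 = 1 - f1^2 is concave.

```python
def objectives(t):
    """Toy objectives to be minimized; the trade-off curve f2 = 1 - f1**2 is concave."""
    return t, 1.0 - t * t


def weighted_sum_minimiser(w, n_grid=1001):
    """Return the grid point t in [0, 1] that minimizes w*f1 + (1 - w)*f2."""
    grid = [i / (n_grid - 1) for i in range(n_grid)]

    def composite(t):
        f1, f2 = objectives(t)
        return w * f1 + (1.0 - w) * f2

    return min(grid, key=composite)


def reachable_points(n_weights=101):
    """Collect the distinct minimizers found while sweeping the weight w over [0, 1]."""
    weights = [i / (n_weights - 1) for i in range(n_weights)]
    return sorted({weighted_sum_minimiser(w) for w in weights})


if __name__ == "__main__":
    # Only the two end points of the trade-off curve are ever selected.
    print(reachable_points())
```

Because the composite function is concave in t for every weight, its minimum always falls at t = 0 or t = 1, so no choice of weights reaches the intermediate compromise solutions.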
Despite the potential usefulness of interactive procedures in finding compromise solutions of interest, due attention has not been paid to procedures that facilitate the preference articulation process for multiresponse problems in the RSM framework.

36.6 Conclusion and Future Work

Low bias and minimum variance are desired response properties at optimal variable settings in multiresponse problems. Thus, optimization performance measures are proposed that can be used with existing methods to facilitate the evaluation of compromise solutions in terms of the desired response properties. They can be easily implemented by analysts and allow the bias, quality of predictions, and robustness of those solutions to be assessed separately. This is useful because the analyst can explore the method’s results with emphasis on the property(ies) of interest. In fact, compromise solutions where some responses are more favorable than others in terms of bias, quality of predictions or robustness may exist. In these instances, the analyst has relevant information available to assign priorities to responses and make a more informed decision based on economic and technical considerations. As the assignment of priorities to responses is an open research field, an iterative procedure that considers the results of the proposed optimization measures arises as an interesting research topic.


References

1. Ayvaz M, Tamer K, Ali H, Ceylan H, Gurarslan G (2009) Hybridizing the harmony search algorithm with a spreadsheet ‘Solver’ for solving continuous engineering optimization problems. Eng Optim 41(12):1119–1144
2. Ch’ng C, Quah S, Low H (2005) A new approach for multiple response optimization. Qual Eng 17(4):621–626
3. Costa N (2010) Simultaneous optimization of mean and standard deviation. Qual Eng 22(3):140–149
4. Das I, Dennis J (1997) A closer look at drawbacks of minimizing weighted sums of objectives for pareto set generation in multicriteria optimization problems. Struct Optim 14(1):63–69
5. Derringer G (1994) A balancing act: optimizing product’s properties. Qual Prog 24:51–58
6. Derringer G, Suich R (1980) Simultaneous optimization of several response variables. J Qual Tech 12(4):214–218
7. Fogliatto F, Albin L (2000) Variance of predicted response as an optimization criterion in multiresponse experiments. Qual Eng 12(4):523–533
8. Gauri S, Pal S (2010) Comparison of performances of five prospective approaches for the multi-response optimization. Int J Adv Manuf Technol 48(12):1205–1220
9. Goh T (2009) Statistical thinking and experimental design as dual drivers of DFSS. Int J Six Sigma Compet Adv 5(1):2–9
10. Kazemzadeh R, Bashiri M, Atkinson A, Noorossana R (2008) A general framework for multiresponse optimization problems based on goal programming. Eur J Oper Res 189(2):421–429
11. Kim K, Lin D (2000) Simultaneous optimization of multiple responses by maximizing exponential desirability functions. Appl Stat Ser C 49(3):311–325
12. Ko Y, Kim K, Jun C (2005) A new loss function-based method for multiresponse optimization. J Qual Tech 37(1):50–59
13. Lee M, Kim Y (2007) Separate response surface modeling for multiple response optimization: multivariate loss function approach. Int J Ind Eng 14(2):227–235
14. Mattson C, Messac A (2003) Concept selection using s-Pareto frontiers. AIAA J 41(6):1190–1198
15. Messac A, Sundararaj G, Tappeta R, Renaud J (2000) Ability of objective functions to generate points on non-convex pareto frontiers. AIAA J 38(6):1084–1091
16. Murphy T, Tsui K, Allen J (2005) A review of robust design methods for multiple responses. Res Eng Des 15(4):201–215
17. Myers R, Montgomery D (2002) Response surface methodology: process and product optimization using designed experiments, 2nd edn. Wiley, New Jersey
18. Myers R, Montgomery D, Anderson-Cook C (2009) Response surface methodology: process and product optimization using designed experiments, 3rd edn. Wiley, New York
19. Pignatiello J (1993) Strategies for robust multiresponse. IIE Trans 25(1):5–15
20. Shah H, Montgomery D, Matthew W (2004) Response surface modeling and optimization in multiresponse experiments using seemingly unrelated regressions. Qual Eng 16(3):387–397
21. Vining G (1998) A compromise approach to multiresponse optimization. J Qual Tech 30(4):309–313
22. Wu F, Chyu C (2004) Optimization of robust design for multiple quality characteristics. Int J Prod Res 42(2):337–354
23. Xu K, Lin D, Tang L, Xie M (2004) Multiresponse systems optimization using a goal attainment approach. IIE Trans 36(5):433–445
24. Yildiz A (2009) A new design optimization framework based on immune algorithm and Taguchi’s method. Comput Ind 60(8):613–620
25. Younis A, Dong Z (2010) Trends, features, and tests of common and recently introduced global optimization methods. Eng Optim 42(8):691–718

Chapter 37

Inspection Policies in Service of Fatigued Aircraft Structures

Nicholas A. Nechval, Konstantin N. Nechval and Maris Purgailis

Abstract Fatigue is one of the most important problems of aircraft, arising from their nature as multiple-component structures subjected to random dynamic loads. For guaranteeing safety, the structural life ceiling limits of the fleet aircraft are defined from three distinct approaches: Safe-Life, Fail-Safe, and Damage Tolerance approaches. The common objectives in defining fleet aircraft lives by the three approaches are to ensure safety while at the same time reducing total ownership costs. In this paper, the damage tolerance approach is considered and the focus is on the inspection scheme with decreasing intervals between inspections. The paper proposes an analysis methodology to determine appropriate decreasing intervals between inspections of fatigue-sensitive aircraft structures (as an alternative to the constant intervals between inspections often used in practice), so that the risk of a catastrophic accident during flight is minimized. The suggested approach is unique and novel in that it allows one to utilize judiciously the results of earlier inspections of fatigued aircraft structures for the purpose of determining the time of the next inspection and estimating the values of several parameters involved in the problem that can be treated as uncertain. Using in-service damage data and taking into account safety risk and maintenance cost at the same time, the above approach has been proposed to assess the reliability of aircraft structures subject to fatigue damage. An illustrative example is given.

N. A. Nechval (corresponding author)
Department of Statistics, EVF Research Institute, University of Latvia, Raina Blvd 19, Riga, LV-1050, Latvia

K. N. Nechval
Department of Applied Mathematics, Transport and Telecommunication Institute, Lomonosov Street 1, Riga, LV-1019, Latvia

M. Purgailis
Department of Cybernetics, University of Latvia, Raina Blvd 19, Riga, LV-1050, Latvia

S. I. Ao and L. Gelman (eds.), Electrical Engineering and Applied Computing, Lecture Notes in Electrical Engineering, 90, DOI: 10.1007/978-94-007-1192-1_37, © Springer Science+Business Media B.V. 2011

37.1 Introduction

In spite of decades of investigation, the fatigue response of materials is yet to be fully understood. This is partially due to the complexity of loading, in which two or more loading axes fluctuate with time. Examples of structures experiencing such complex loadings are automobiles, aircraft, offshore structures, railways and nuclear plants. While most industrial failures involve fatigue, assessing the fatigue reliability of industrial components subjected to various dynamic loading situations is one of the most difficult engineering problems. This is because material degradation processes due to fatigue depend upon material characteristics, component geometry, loading history and environmental conditions. The traditional analytical method of engineering fracture mechanics (EFM) usually assumes that crack size, stress level, material properties, crack growth rate, etc. are all deterministic values, which leads to conservative or very conservative outcomes. However, according to many experimental results and field data, even under well-controlled laboratory conditions, crack growth results usually show considerable statistical variability [1].

The analysis of fatigue crack growth is one of the most important tasks in the design and life prediction of aircraft fatigue-sensitive structures (for instance, wing, fuselage) and their components (for instance, aileron or balancing flap as part of the wing panel, stringer, etc.). Several probabilistic or stochastic models have been employed to fit the data from various fatigue crack growth experiments. Among them are the Markov chain model [2], the second-order approximation model [3], and the modified second-order polynomial model [4]. Each of the models may be the most appropriate one to depict a particular set of fatigue growth data but not necessarily the others. All models can be improved to depict the growth data very accurately but, of course, only at the cost of increased computational complexity. Yang’s model [3] and the polynomial model [4] are considered more appropriate than the Markov chain model [2] by some researchers, through the introduction of a differential equation which indicates that the fatigue crack growth rate is a function of crack size and other parameters. The parameters, however, can only be determined through the observation and measurement of many crack growth samples. Unfortunately, the above models are mathematically too complicated for fatigue researchers as well as design engineers. A large gap still needs to be bridged between the fatigue experimentalists and the researchers who use probabilistic methods to study fatigue crack growth problems.

Airworthiness regulations require proof that aircraft can be operated safely. This implies that critical components must be replaced or repaired before safety is compromised. For guaranteeing safety, the structural life ceiling limits of the fleet aircraft are defined from three distinct approaches: Safe-Life, Fail-Safe, and Damage-Tolerant approaches. The common objectives in defining fleet aircraft lives by the three approaches are to ensure safety while at the same time reducing total ownership costs. Although the objectives of the three approaches are the same, they vary with regard to the fundamental definition of service life. The Safe-Life approach is based on the concept that significant damage, i.e. fatigue cracking, will not develop during the service life of a component. When the service life equals the design Safe-Life, the component must be replaced. The Fail-Safe approach assumes initial damage as manufactured and its subsequent growth during service to detectable crack sizes or greater. Service life in Fail-Safe structures can thus be defined as the time to a service-detectable damage. However, there are two major drawbacks to the Safe-Life and Fail-Safe approaches: (1) components are taken out of service even though they may have substantial remaining lives; (2) despite all precautions, cracks sometimes occur prematurely. These facts led the airlines to introduce the Damage Tolerance approach, which is based on the concept that damage can occur and develop during the service life of a component. In this paper, the Damage Tolerance approach is considered and the focus is on the inspection scheme with decreasing intervals between inspections.

From an engineering standpoint the fatigue life of a component or structure consists of two periods: (i) the crack initiation period, which starts with the first load cycle and ends when a technically detectable crack is present, and (ii) the crack propagation period, which starts with a technically detectable crack and ends when the remaining cross section can no longer withstand the loads applied and fails statically. Periodic inspections of aircraft are common practice in order to maintain their reliability above a desired minimum level. The appropriate inspection intervals are determined so that the fatigue reliability of the entire aircraft structure remains above the minimum reliability level throughout its service life.

37.2 Inspection Scheme Under Fatigue Crack Initiation

In this section we first consider the problem of estimating the minimum time to crack initiation (warranty period or time to the first inspection) for a number of aircraft structure components, before which no detectable cracks occur in the material, based on the results of previous warranty period tests on the structure components in question. If in a fleet of k aircraft there are km of the same individual structure components, operating independently, the length of time until the first crack initially forms in any of these components is of basic interest, and provides a measure of assurance concerning the operation of the components in question. This leads to the consideration of the following problem. Suppose we have observations $X_1, \ldots, X_n$ as the results of tests conducted on the components; suppose also that there are km components of the same kind to be put into future use, with times to crack initiation $Y_1, \ldots, Y_{km}$. Then we want to be able to estimate, on the basis of $X_1, \ldots, X_n$, the shortest time to crack initiation $Y_{(1,km)}$ among the times to crack initiation $Y_1, \ldots, Y_{km}$. In other words, it is desirable to construct a lower simultaneous prediction limit, $L_\gamma$, which is exceeded with probability $\gamma$ by observations or functions of observations of all k future samples, each consisting of m units. In this section, the problem of estimating $Y_{(1,km)}$, the smallest of all k future samples of m observations from the underlying distribution, based on an observed sample of n observations from the same distribution, is considered.

Assigning the time interval until the first inspection. Experiments show that the number of flight cycles (hours) at which a technically detectable crack will appear in a fatigue-sensitive component of an aircraft structure follows the two-parameter Weibull distribution. The probability density function for the random variable X of the two-parameter Weibull distribution is given by

\[
f(x \mid \beta, \delta) = \frac{\delta}{\beta}\left(\frac{x}{\beta}\right)^{\delta-1}\exp\left[-\left(\frac{x}{\beta}\right)^{\delta}\right] \quad (x > 0), \tag{37.1}
\]

where $\delta > 0$ and $\beta > 0$ are the shape and scale parameters, respectively. The following theorem is used to assign the time interval until the first inspection (warranty period).

Theorem 1 (Lower one-sided prediction limit for the lth order statistic of the Weibull distribution). Let $X_1 < \cdots < X_r$ be the first r ordered past observations from a sample of size n from the distribution (37.1). Then a lower one-sided conditional $(1-\alpha)$ prediction limit h on the lth order statistic $Y_l$ of a set of m future ordered observations $Y_1 < \cdots < Y_m$ is given by

\[
\begin{aligned}
\Pr\{Y_l > h \mid \mathbf{z}\}
&= \Pr\left\{\hat{\delta}\ln(Y_l/\hat{\beta}) > \hat{\delta}\ln(h/\hat{\beta}) \mid \mathbf{z}\right\}
 = \Pr\{W_l > w_h \mid \mathbf{z}\} \\
&= \frac{\displaystyle\sum_{j=0}^{l-1}\binom{l-1}{j}\frac{(-1)^{l-1-j}}{m-j}\int_0^\infty v^{r-2}\, e^{v\hat{\delta}\sum_{i=1}^{r}\ln(x_i/\hat{\beta})}\left[(m-j)e^{v w_h} + \sum_{i=1}^{r} e^{v\hat{\delta}\ln(x_i/\hat{\beta})} + (n-r)e^{v\hat{\delta}\ln(x_r/\hat{\beta})}\right]^{-r} dv}
        {\displaystyle\sum_{j=0}^{l-1}\binom{l-1}{j}\frac{(-1)^{l-1-j}}{m-j}\int_0^\infty v^{r-2}\, e^{v\hat{\delta}\sum_{i=1}^{r}\ln(x_i/\hat{\beta})}\left[\sum_{i=1}^{r} e^{v\hat{\delta}\ln(x_i/\hat{\beta})} + (n-r)e^{v\hat{\delta}\ln(x_r/\hat{\beta})}\right]^{-r} dv} \\
&= 1 - \alpha,
\end{aligned}
\tag{37.2}
\]

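where $\hat{\beta}$ and $\hat{\delta}$ are the maximum likelihood estimators of $\beta$ and $\delta$ based on the first r ordered past observations $(X_1, \ldots, X_r)$ from a sample of size n from the Weibull distribution, which can be found by solving

\[
\hat{\beta} = \left(\left[\sum_{i=1}^{r} x_i^{\hat{\delta}} + (n-r)x_r^{\hat{\delta}}\right]\Big/ r\right)^{1/\hat{\delta}}, \tag{37.3}
\]

and

\[
\hat{\delta} = \left[\left(\sum_{i=1}^{r} x_i^{\hat{\delta}}\ln x_i + (n-r)x_r^{\hat{\delta}}\ln x_r\right)\left(\sum_{i=1}^{r} x_i^{\hat{\delta}} + (n-r)x_r^{\hat{\delta}}\right)^{-1} - \frac{1}{r}\sum_{i=1}^{r}\ln x_i\right]^{-1}, \tag{37.4}
\]

\[
\mathbf{z} = (z_1, z_2, \ldots, z_{r-2}), \quad Z_i = \hat{\delta}\ln(X_i/\hat{\beta}), \quad i = 1, \ldots, r-2, \tag{37.5}
\]

\[
W_l = \hat{\delta}\ln(Y_l/\hat{\beta}), \quad w_h = \hat{\delta}\ln(h/\hat{\beta}). \tag{37.6}
\]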

(Observe that an upper one-sided conditional $\alpha$ prediction limit h on the lth order statistic $Y_l$ may be obtained from a lower one-sided conditional $(1-\alpha)$ prediction limit by replacing $1-\alpha$ by $\alpha$.) □

Proof The proof is given by Nechval et al. [5] and so it is omitted here.
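Equations (37.3) and (37.4) have no closed-form solution, so the estimates have to be found numerically. The following is a minimal sketch, assuming a simple bisection on the shape parameter; the function name `weibull_mle`, the search bracket and the tolerance are illustrative choices, not part of the chapter.

```python
import math


def weibull_mle(x, n=None, tol=1e-12):
    """Solve (37.3)-(37.4) for (beta_hat, delta_hat).

    x holds the r smallest observations of a sample of size n
    (n defaults to len(x), i.e. a complete sample).
    """
    xs = sorted(x)
    r = len(xs)
    n = r if n is None else n
    logs = [math.log(v) for v in xs]
    mean_log = sum(logs) / r
    log_max = logs[-1]
    # Each observation counts once; the n - r censored units count at x_r.
    counts = [1.0] * r
    counts[-1] += n - r

    def weights(delta):
        # x_i**delta divided by x_r**delta, which avoids overflow for large delta
        return [c * math.exp(delta * (lg - log_max)) for c, lg in zip(counts, logs)]

    def score(delta):
        # Rearranged (37.4): zero at the maximum likelihood estimate of delta
        w = weights(delta)
        weighted_mean_log = sum(wi * lg for wi, lg in zip(w, logs)) / sum(w)
        return weighted_mean_log - mean_log - 1.0 / delta

    lo, hi = 1e-3, 1e3  # assumed bracket for the shape parameter
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if score(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    delta_hat = 0.5 * (lo + hi)
    # (37.3), with the common factor x_r**delta_hat restored
    beta_hat = math.exp(log_max) * (sum(weights(delta_hat)) / r) ** (1.0 / delta_hat)
    return beta_hat, delta_hat


if __name__ == "__main__":
    # Stringer data from the example later in this section (units of 10^4 flight hours)
    print(weibull_mle([5.0, 6.25, 7.5, 7.9, 8.1]))
```

Applied to the five stringer observations of the example in this section, the sketch should return values close to the quoted estimates $\hat{\beta} = 7.42603$ and $\hat{\delta} = 7.9081$. Bisection is used here because the rearranged equation (37.4) is monotone in the shape parameter; a Newton iteration would also work.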

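Corollary 1.1 A lower one-sided conditional $(1-\alpha)$ prediction limit h on the minimum $Y_1$ of a set of m future ordered observations $Y_1 \le \cdots \le Y_m$ is given by

\[
\begin{aligned}
\Pr\{Y_1 > h \mid \mathbf{z}\}
&= \Pr\left\{\hat{\delta}\ln(Y_1/\hat{\beta}) > \hat{\delta}\ln(h/\hat{\beta}) \mid \mathbf{z}\right\}
 = \Pr\{W_1 > w_h \mid \mathbf{z}\} \\
&= \frac{\displaystyle\int_0^\infty v^{r-2}\, e^{v\hat{\delta}\sum_{i=1}^{r}\ln(x_i/\hat{\beta})}\left[m e^{v w_h} + \sum_{i=1}^{r} e^{v\hat{\delta}\ln(x_i/\hat{\beta})} + (n-r)e^{v\hat{\delta}\ln(x_r/\hat{\beta})}\right]^{-r} dv}
        {\displaystyle\int_0^\infty v^{r-2}\, e^{v\hat{\delta}\sum_{i=1}^{r}\ln(x_i/\hat{\beta})}\left[\sum_{i=1}^{r} e^{v\hat{\delta}\ln(x_i/\hat{\beta})} + (n-r)e^{v\hat{\delta}\ln(x_r/\hat{\beta})}\right]^{-r} dv} \\
&= 1 - \alpha.
\end{aligned}
\tag{37.7}
\]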
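Formula (37.7) is a ratio of two one-dimensional integrals, so it can be evaluated with any standard quadrature rule and then inverted for h. The sketch below is one possible way to do this, assuming composite Simpson integration on a truncated range and a bisection on h. All function names, the truncation point `v_max`, the number of steps and the search bracket are assumptions made for illustration. The number of future components m is not stated in the example of this section, so m = 5 is used purely as an assumed value.

```python
import math


def _log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))


def _integral(z, n, log_m=None, w_h=0.0, v_max=40.0, steps=2000):
    """Composite Simpson estimate of one integral in (37.7).

    With log_m=None the m*exp(v*w_h) term is left out (denominator);
    otherwise it is included (numerator).
    """
    r = len(z)
    z_sum = sum(z)

    def integrand(v):
        exponents = [v * zi for zi in z]
        if n > r:  # censored units all share the largest observed value
            exponents.append(math.log(n - r) + v * z[-1])
        if log_m is not None:
            exponents.append(log_m + v * w_h)
        return v ** (r - 2) * math.exp(v * z_sum - r * _log_sum_exp(exponents))

    step = v_max / steps
    total = integrand(0.0) + integrand(v_max)
    for k in range(1, steps):
        total += (4.0 if k % 2 else 2.0) * integrand(k * step)
    return total * step / 3.0


def prob_min_exceeds(h, x, beta_hat, delta_hat, m, n=None):
    """Pr{Y_1 > h | z} for the smallest of m future Weibull observations."""
    xs = sorted(x)
    n = len(xs) if n is None else n
    z = [delta_hat * math.log(xi / beta_hat) for xi in xs]
    w_h = delta_hat * math.log(h / beta_hat)
    return _integral(z, n, math.log(m), w_h) / _integral(z, n)


def lower_prediction_limit(x, beta_hat, delta_hat, m, alpha=0.05, n=None):
    """Bisection (on a log scale) for h with Pr{Y_1 > h | z} = 1 - alpha."""
    lo, hi = 1e-6 * beta_hat, 10.0 * beta_hat  # assumed search bracket
    for _ in range(40):
        mid = math.sqrt(lo * hi)
        if prob_min_exceeds(mid, x, beta_hat, delta_hat, m, n) > 1.0 - alpha:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)


if __name__ == "__main__":
    data = [5.0, 6.25, 7.5, 7.9, 8.1]  # stringer data, units of 10^4 flight hours
    print(lower_prediction_limit(data, 7.42603, 7.9081, m=5))  # m = 5 is assumed
```

The integrand is evaluated on a log scale so that large exponents do not overflow. The truncation point and step count suit data of the scale used in the example and may need adjusting for other samples.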

Thus, when l = 1, (37.2) reduces to formula (37.7).

Theorem 2 (Lower one-sided prediction limit for the lth order statistic of the exponential distribution). Under the conditions of Theorem 1, if $\delta = 1$, we deal with the exponential distribution, the probability density function of which is given by

\[
f(x \mid \beta) = \frac{1}{\beta}\exp\left(-\frac{x}{\beta}\right) \quad (x > 0). \tag{37.8}
\]

Then a lower one-sided conditional $(1-\alpha)$ prediction limit h on the lth order statistic $Y_l$ of a set of m future ordered observations $Y_1 < \cdots < Y_m$ is given by

\[
\begin{aligned}
\Pr\{Y_l \ge h \mid S_b = s_b\}
&= \Pr\left\{\frac{Y_l}{S_b} \ge \frac{h}{s_b} \,\Big|\, S_b = s_b\right\} = \Pr\{W_l > w_h\} \\
&= \frac{1}{B(l, m-l+1)}\sum_{j=0}^{l-1}\binom{l-1}{j}(-1)^j\,\frac{1}{(m-l+1+j)\left[1 + w_h(m-l+1+j)\right]^{r}} = 1 - \alpha,
\end{aligned}
\tag{37.9}
\]

where

\[
W_l = \frac{Y_l}{S_b}, \quad w_h = \frac{h}{s_b}, \quad S_b = \sum_{i=1}^{r} X_i + (n-r)X_r. \tag{37.10}
\]

Proof It follows readily from the standard theory of order statistics that the distribution of the lth order statistic $Y_l$ from a set of m future ordered observations $Y_1 \le \cdots \le Y_m$ is given by

\[
f(y_l \mid \beta)\,dy_l = \frac{1}{B(l, m-l+1)}\left[F(y_l \mid \beta)\right]^{l-1}\left[1 - F(y_l \mid \beta)\right]^{m-l} dF(y_l \mid \beta), \tag{37.11}
\]

where

\[
F(x \mid \beta) = 1 - \exp(-x/\beta). \tag{37.12}
\]

The factorization theorem gives

\[
S_b = \sum_{i=1}^{r} X_i + (n-r)X_r \tag{37.13}
\]

sufficient for $\beta$. The density of $S_b$ is given by

\[
g(s_b \mid \beta) = \frac{1}{\Gamma(r)\beta^{r}}\, s_b^{\,r-1}\exp\left(-\frac{s_b}{\beta}\right), \quad s_b \ge 0. \tag{37.14}
\]

Since $Y_l$ and $S_b$ are independent, the joint density of $Y_l$ and $S_b$ is

\[
f(y_l, s_b \mid \beta) = \frac{1}{B(l, m-l+1)}\left[1 - e^{-y_l/\beta}\right]^{l-1}\left[e^{-y_l/\beta}\right]^{m-l+1}\frac{1}{\Gamma(r)\beta^{r+1}}\, s_b^{\,r-1} e^{-s_b/\beta}. \tag{37.15}
\]

Making the transformation $w_l = y_l/s_b$, $s_b = s_b$, and integrating out $s_b$, we find the density of $W_l$ as the beta density

\[
f(w_l) = \frac{r}{B(l, m-l+1)}\sum_{j=0}^{l-1}\binom{l-1}{j}(-1)^j\,\frac{1}{\left[(m-l+1+j)w_l + 1\right]^{r+1}}, \quad 0 < w_l < \infty. \tag{37.16}
\]

Integrating (37.16) from $w_h$ to infinity gives (37.9). This ends the proof. □

Corollary 2.1 A lower one-sided conditional $(1-\alpha)$ prediction limit h on the minimum $Y_1$ of a set of m future ordered observations $Y_1 \le \cdots \le Y_m$ is given by

\[
\Pr\{Y_1 \ge h \mid S_b = s_b\} = \Pr\left\{\frac{Y_1}{S_b} \ge \frac{h}{s_b} \,\Big|\, S_b = s_b\right\} = \Pr\{W_1 > w_h\} = \frac{1}{(1 + m w_h)^{r}} = 1 - \alpha. \tag{37.17}
\]
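In the exponential case the prediction limit is available in closed form for the minimum and needs only a one-dimensional root search for a general order statistic. The sketch below illustrates both, assuming the reading of (37.9) in which the bracket contains $w_h$. The function names and the numerical inputs in the usage lines are hypothetical and serve only as an illustration.

```python
import math


def prob_exceeds(w_h, l, m, r):
    """Pr{W_l > w_h} from (37.9) for the l-th of m future exponential observations."""
    beta_fn = math.gamma(l) * math.gamma(m - l + 1) / math.gamma(m + 1)  # B(l, m-l+1)
    total = 0.0
    for j in range(l):
        c = m - l + 1 + j
        total += math.comb(l - 1, j) * (-1) ** j / (c * (1.0 + w_h * c) ** r)
    return total / beta_fn


def limit_for_minimum(s_b, m, r, alpha=0.05):
    """Closed-form h from (37.17): (1 + m*h/s_b)**(-r) = 1 - alpha."""
    return s_b * ((1.0 - alpha) ** (-1.0 / r) - 1.0) / m


def limit_for_order_statistic(s_b, l, m, r, alpha=0.05, iterations=200):
    """Bisection for h = w_h * s_b such that Pr{W_l > w_h} = 1 - alpha."""
    lo, hi = 0.0, 1.0
    while prob_exceeds(hi, l, m, r) > 1.0 - alpha:  # expand until the root is bracketed
        hi *= 2.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if prob_exceeds(mid, l, m, r) > 1.0 - alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi) * s_b


if __name__ == "__main__":
    # Hypothetical inputs: r = 5 observed failure times summing to s_b = 34.75,
    # and m = 5 future components.
    print(limit_for_minimum(34.75, m=5, r=5))
    print(limit_for_order_statistic(34.75, l=2, m=5, r=5))
```

For l = 1 the bisection reproduces the closed-form limit, which gives a simple consistency check on the general formula.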

Example Consider the data of fatigue tests on a particular type of structural component (stringer) of the IL-86 aircraft. The data are for a complete sample of size r = n = 5, with observed times to crack initiation (in units of $10^4$ flight hours): $X_1 = 5$, $X_2 = 6.25$, $X_3 = 7.5$, $X_4 = 7.9$, $X_5 = 8.1$.

Goodness-of-fit testing. It is assumed that $X_i$, i = 1(1)5, follow the two-parameter Weibull distribution (37.1), where the parameters $\beta$ and $\delta$ are unknown. We assess the statistical significance of departures from the Weibull model by performing an empirical distribution function goodness-of-fit test. We use the S statistic (Kapur and Lamberson [6]). For censored (or complete) datasets, the S statistic is given by

\[
S = \frac{\displaystyle\sum_{i=[r/2]+1}^{r-1}\frac{\ln(x_{i+1}/x_i)}{M_i}}{\displaystyle\sum_{i=1}^{r-1}\frac{\ln(x_{i+1}/x_i)}{M_i}}
  = \frac{\displaystyle\sum_{i=3}^{4}\frac{\ln(x_{i+1}/x_i)}{M_i}}{\displaystyle\sum_{i=1}^{4}\frac{\ln(x_{i+1}/x_i)}{M_i}} = 0.184, \tag{37.18}
\]

where [r/2] is the largest integer $\le r/2$ and the values of $M_i$ are given in Table 13 (Kapur and Lamberson [6]). The rejection region for the $\alpha$ level of significance is $\{S > S_{n;\alpha}\}$. The percentage points for $S_{n;\alpha}$ were given by Kapur and Lamberson [6]. For this example,

\[
S = 0.184 < S_{n=5;\,\alpha=0.05} = 0.86. \tag{37.19}
\]

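Thus, there is no evidence to rule out the Weibull model. The maximum likelihood estimates of the unknown parameters $\beta$ and $\delta$ are $\hat{\beta} = 7.42603$ and $\hat{\delta} = 7.9081$, respectively.

Warranty period estimation. It follows from (37.7) that

\[
\begin{aligned}
\Pr\{Y_1 > h \mid \mathbf{z}\}
&= \Pr\left\{\hat{\delta}\ln(Y_1/\hat{\beta}) > \hat{\delta}\ln(h/\hat{\beta}) \mid \mathbf{z}\right\} = \Pr\{W_1 > w_h \mid \mathbf{z}\} \\
&= \Pr\{W_1 > -8.4378 \mid \mathbf{z}\} = \frac{0.0000141389}{0.0000148830} = 0.95
\end{aligned}
\tag{37.20}
\]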

and a lower 0.95 prediction limit for $Y_1$ is $h = 2.5549\ (\times 10^4)$ flight hours, i.e., the time interval until the first inspection (or warranty period) is 25,549 flight hours with confidence level $\gamma = 1 - \alpha = 0.95$.

Inspection Policy after Warranty Period. Let us assume that in a fleet of m aircraft there are m of the same individual structure components, operating independently. Suppose an inspection is carried out at time $\tau_j$, and this shows that an initial (detectable) crack has not yet occurred. We now have to schedule the next inspection. Let $Y_1$ be the minimum time to crack initiation in the above components. In other words, let $Y_1$ be the smallest observation from an independent second sample of m observations from the distribution (37.1). Then the inspection times can be calculated (from (37.23) using (37.22)) as

\[
\tau_j = \hat{\beta}\exp(w_{\tau_j}/\hat{\delta}), \quad j \ge 1, \tag{37.21}
\]

where it is assumed that $\tau_0 = 0$, $\tau_1$ is the time until the first inspection (or warranty period), and $w_{\tau_j}$ is determined from

\[
\begin{aligned}
\Pr\{Y_1 > \tau_j \mid Y_1 > \tau_{j-1}, \mathbf{z}\}
&= \Pr\left\{\hat{\delta}\ln(Y_1/\hat{\beta}) > \hat{\delta}\ln(\tau_j/\hat{\beta}) \,\Big|\, \hat{\delta}\ln(Y_1/\hat{\beta}) > \hat{\delta}\ln(\tau_{j-1}/\hat{\beta}), \mathbf{z}\right\} \\
&= \Pr\{W_1 > w_{\tau_j} \mid W_1 > w_{\tau_{j-1}}, \mathbf{z}\}
 = \frac{\Pr\{W_1 > w_{\tau_j} \mid \mathbf{z}\}}{\Pr\{W_1 > w_{\tau_{j-1}} \mid \mathbf{z}\}} = 1 - \alpha,
\end{aligned}
\tag{37.22}
\]

where

\[
W_1 = \hat{\delta}\ln(Y_1/\hat{\beta}), \quad w_{\tau_j} = \hat{\delta}\ln(\tau_j/\hat{\beta}), \tag{37.23}
\]

$\hat{\beta}$ and $\hat{\delta}$ are the ML