Multi-armed Bandit Online Learning Based on POMDP in Cognitive Radio

Juan Zhang; Hesong-Jiang; Hong Jiang; Chunmei Chen


Publisher: Science & Engineering Research Support Center, Republic of Korea (IJSH)
Journal: International Journal of Smart Home
Volume/Issue: Vol.8 No.3 (2014.05)
Pages: pp.151-162
Authors: Juan Zhang, Hesong-Jiang, Hong Jiang, Chunmei Chen
Language: English (ENG)
URL: https://www.earticle.net/Article/A230895


Article Information

Abstract

English: In cognitive radio, most existing research on spectrum sharing has two weaknesses. First, the problem is largely formulated as a Markov decision process (MDP), which requires complete knowledge of the channel. Second, most studies rely on online learning based on the sensed channel. To address these problems, this paper proposes a new algorithm: if an authorized user occupies the current channel, the secondary user transmits conservatively at a low rate; otherwise it transmits aggressively. When the secondary user transmits conservatively, the channel state is not directly observable, so the problem becomes a Partially Observable Markov Decision Process (POMDP). We first establish the optimal threshold policy when the channel is known, then consider optimal transmission when the channel is unknown and model it as a multi-armed bandit. We obtain the optimal K-conservative policy through the UCB algorithm and improve the convergence speed with the UCB-TUNED algorithm. Simulation and analysis show that multi-armed bandit online learning under an incompletely known channel yields the same K-conservative policy as the optimal threshold policy under a known channel, and that UCB-TUNED improves the convergence speed.
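
The bandit step mentioned in the abstract can be illustrated with a minimal Python sketch. It uses the standard UCB1 and UCB1-Tuned index formulas; the paper's exact variant is not given on this page and may differ. Everything specific here is an assumption made for illustration: each arm stands for one candidate value of K, rewards are Bernoulli draws in [0, 1], and the names `run_bandit`, `simulate`, and `success_probs` are hypothetical.

```python
import math
import random


def ucb1_index(mean, pulls, total):
    """Standard UCB1 index: empirical mean plus an exploration bonus."""
    return mean + math.sqrt(2.0 * math.log(total) / pulls)


def ucb_tuned_index(mean, mean_sq, pulls, total):
    """Standard UCB1-Tuned index: bonus scaled by a variance bound capped at 1/4."""
    explore = math.log(total) / pulls
    # Empirical variance (clamped against rounding error) plus its own bonus.
    variance_bound = max(0.0, mean_sq - mean * mean) + math.sqrt(2.0 * explore)
    return mean + math.sqrt(explore * min(0.25, variance_bound))


def run_bandit(reward_fn, num_arms, horizon, tuned=False):
    """Pull every arm once, then always pull the arm with the largest index.

    reward_fn(arm) must return a reward in [0, 1].
    Returns the number of pulls per arm.
    """
    pulls = [0] * num_arms
    sums = [0.0] * num_arms
    sq_sums = [0.0] * num_arms

    def index(arm, total):
        mean = sums[arm] / pulls[arm]
        if tuned:
            return ucb_tuned_index(mean, sq_sums[arm] / pulls[arm], pulls[arm], total)
        return ucb1_index(mean, pulls[arm], total)

    for t in range(horizon):  # t is the number of pulls made so far
        if t < num_arms:
            arm = t  # initialization: try each arm once
        else:
            arm = max(range(num_arms), key=lambda a: index(a, t))
        reward = reward_fn(arm)
        pulls[arm] += 1
        sums[arm] += reward
        sq_sums[arm] += reward * reward
    return pulls


def simulate(success_probs, horizon, tuned=False, seed=0):
    """Hypothetical stand-in for the channel: arm k succeeds with success_probs[k]."""
    rng = random.Random(seed)

    def reward_fn(arm):
        return 1.0 if rng.random() < success_probs[arm] else 0.0

    return run_bandit(reward_fn, len(success_probs), horizon, tuned)
```

Calling `simulate([0.2, 0.5, 0.8], 5000)` and `simulate([0.2, 0.5, 0.8], 5000, tuned=True)` returns per-arm pull counts for the two index rules, so the share of pulls spent on the best arm can be compared directly. In a real system the Bernoulli stand-in would be replaced by the throughput observed when running a K-conservative policy on the channel.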

Table of Contents

Abstract
1. Introduction
2. The System Model
  2.1. POMDP Model
  2.2. Channel Modeling based on POMDP
3. The known Channel State of the Optimal Transmission Threshold Strategy
  3.1. K Conservative Strategy Structure Modeling
  3.2. The Challenge of the K Conservative Strategy
  3.3. UCB Algorithm
4. Simulation Results
  4.1. The off-line Algorithm for Optimal Transmission Threshold Strategy
  4.2. Online Learning Algorithm of K Arm Gambling Machine in the Unknown Channel State
5. Conclusions
Acknowledgements
References

Keywords

spectrum sharing; multi-armed bandit; online learning; Partially Observable Markov Decision Process

Authors

Juan Zhang [ the Open Fund of Robot Technology Used for Special Environment Key Laboratory of Sichuan Province, School of Information Engineering, Southwest University of Science and Technology, Mianyang, China ]
Hesong-Jiang [ the Open Fund of Robot Technology Used for Special Environment Key Laboratory of Sichuan Province, School of Information Engineering, Southwest University of Science and Technology, Mianyang, China ]
Hong Jiang [ the Open Fund of Robot Technology Used for Special Environment Key Laboratory of Sichuan Province, School of Information Engineering, Southwest University of Science and Technology, Mianyang, China ]
Chunmei Chen [ the Open Fund of Robot Technology Used for Special Environment Key Laboratory of Sichuan Province, School of Information Engineering, Southwest University of Science and Technology, Mianyang, China ]


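Journal Information

Publisher

Name: 보안공학연구지원센터(IJSH) [Science & Engineering Research Support Center, Republic of Korea (IJSH)]
Founded: 2006
Field: Engineering > Computer Science
About:
1. Surveys and research on security engineering
2. Research on and presentation of applied security engineering technologies
3. Hosting academic conferences and exhibitions on security engineering
4. Mutual cooperation and information exchange on security engineering technology
5. Standardization projects and establishment of specifications for security engineering
6. Promotion of industry-academia-research cooperation in security engineering
7. International academic exchange and technical cooperation
8. Publication of journals on security engineering
9. Other activities necessary to achieve the organization's purpose

Journal

Title: International Journal of Smart Home
Frequency: Bimonthly
pISSN: 1975-4094
Coverage: 2008~2016
Classification: KDC 505, DDC 605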
