
How can I identify the request queue for a Linux block device?

I am working on a driver that connects hard disks over the network. There is a bug: if I enable two or more hard disks on the computer, only the first one has its partitions scanned and identified. For example, with one partition on hda and one partition on hdb, as soon as I connect hda there is a partition that can be mounted, and hda1 gets a blkid of xyz123 as soon as it mounts. But when I go ahead and mount hdb1, it comes up with the same blkid; in fact, the driver is reading it from hda, not hdb.

So I think I found the place where the driver is messing up. Below is a debug output including a dump_stack which I put at the first spot where it seems to be accessing the wrong device.

Here is the code section:

/* Basically, this is just the request_queue processor. In the log output that
   follows, the second device (hdb) has just been connected, right after hda
   was connected and hda1 was mounted to the system. */
void nblk_request_proc(struct request_queue *q)
{
    struct request *req;
    ndas_error_t err = NDAS_OK;

    dump_stack();

    while ((req = NBLK_NEXT_REQUEST(q)) != NULL) {
        dbgl_blk(8, "processing queue request from slot %d", SLOT_R(req));

        if (test_bit(NDAS_FLAG_QUEUE_SUSPENDED, &(NDAS_GET_SLOT_DEV(SLOT_R(req))->queue_flags))) {
            printk("ndas: Queue is suspended\n");
            /* Queue is suspended */
#if ( LINUX_VERSION_CODE >= KERNEL_VERSION(2,6,31) )
            blk_start_request(req);
#else
            blkdev_dequeue_request(req);
#endif

Here is a log output. I have added some comments to help understand what is happening and where the bad call seems to come up.

/* Just below here you can see "slot" mentioned many times. This is the
   identification for the network case in which the hd is connected to the
   network. So you will see slot 2 in this log because the first device has
   already been connected and mounted. */

kernel: [231644.155503] BL|4|slot_enable|/driver/block/ctrldev.c:281|adding disk: slot=2, first_minor=16, capacity=976769072|nd/dpcd1,64:15:44.38,3828:10
kernel: [231644.155588] BL|3|ndop_open|/driver/block/ops.c:233|ing bdev=f6823400|nd/dpcd1,64:15:44.38,3720:10
kernel: [231644.155598] BL|2|ndop_open|/driver/block/ops.c:247|slot =0x2|nd/dpcd1,64:15:44.38,3720:10
kernel: [231644.155606] BL|2|ndop_open|/driver/block/ops.c:248|dev_t=0x3c00010|nd/dpcd1,64:15:44.38,3720:10
kernel: [231644.155615] ND|3|ndas_query_slot|netdisk/nddev.c:791|slot=2 sdev=d33e2080|nd/dpcd1,64:15:44.38,3696:10
kernel: [231644.155624] ND|3|ndas_query_slot|netdisk/nddev.c:817|ed|nd/dpcd1,64:15:44.38,3696:10
kernel: [231644.155631] BL|3|ndop_open|/driver/block/ops.c:326|mode=1|nd/dpcd1,64:15:44.38,3720:10
kernel: [231644.155640] BL|3|ndop_open|/driver/block/ops.c:365|ed open|nd/dpcd1,64:15:44.38,3724:10
kernel: [231644.155653] BL|8|ndop_revalidate_disk|/driver/block/ops.c:2334|gendisk=c6afd800={major=60,first_minor=16,minors=0x10,disk_name=ndas-44700486-0,private_data=00000002,capacity=%lld}|nd/dpcd1,64:15:44.38,3660:10
kernel: [231644.155668] BL|8|ndop_revalidate_disk|/driver/block/ops.c:2346|ed|nd/dpcd1,64:15:44.38,3652:10

/* So at this point the hard disk is added (gendisk=c6...) and the
   identifications all match the network device. The driver is now about to
   begin scanning the hard drive for existing partitions. The little 'ed' at
   the end of the previous line indicates that revalidate_disk has finished
   its job.

   Also, I think the request queue is indicated by the output dpcd1 near the
   very end of the line.

   Now below we have entered the function that is pasted above. In the
   function you can see that the slot can be determined by the queue. And the
   log output after the stack dump shows it is from slot 1 (the first network
   drive, which was already mounted). */

kernel: [231644.155677]  ndas-44700486-0:Pid: 467, comm: nd/dpcd1 Tainted: P           2.6.32-5-686 #1
kernel: [231644.155711] Call Trace:
kernel: [231644.155723]  [<fc5a7685>] ? nblk_request_proc+0x9/0x10c [ndas_block]
kernel: [231644.155732]  [<c11298db>] ? __generic_unplug_device+0x23/0x25
kernel: [231644.155737]  [<c1129afb>] ? generic_unplug_device+0x1e/0x2e
kernel: [231644.155743]  [<c1123090>] ? blk_unplug+0x2e/0x31
kernel: [231644.155750]  [<c10cceec>] ? block_sync_page+0x33/0x34
kernel: [231644.155756]  [<c108770c>] ? sync_page+0x35/0x3d
kernel: [231644.155763]  [<c126d568>] ? __wait_on_bit_lock+0x31/0x6a
kernel: [231644.155768]  [<c10876d7>] ? sync_page+0x0/0x3d
kernel: [231644.155773]  [<c10876aa>] ? __lock_page+0x76/0x7e
kernel: [231644.155780]  [<c1043f1f>] ? wake_bit_function+0x0/0x3c
kernel: [231644.155785]  [<c1087b76>] ? do_read_cache_page+0xdf/0xf8
kernel: [231644.155791]  [<c10d21b9>] ? blkdev_readpage+0x0/0xc
kernel: [231644.155796]  [<c1087bbc>] ? read_cache_page_async+0x14/0x18
kernel: [231644.155801]  [<c1087bc9>] ? read_cache_page+0x9/0xf
kernel: [231644.155808]  [<c10ed6fc>] ? read_dev_sector+0x26/0x60
kernel: [231644.155813]  [<c10ee368>] ? adfspart_check_ICS+0x20/0x14c
kernel: [231644.155819]  [<c10ee138>] ? rescan_partitions+0x17e/0x378
kernel: [231644.155825]  [<c10ee348>] ? adfspart_check_ICS+0x0/0x14c
kernel: [231644.155830]  [<c10d26a3>] ? __blkdev_get+0x225/0x2c7
kernel: [231644.155836]  [<c10ed7e6>] ? register_disk+0xb0/0xfd
kernel: [231644.155843]  [<c112e33b>] ? add_disk+0x9a/0xe8
kernel: [231644.155848]  [<c112dafd>] ? exact_match+0x0/0x4
kernel: [231644.155853]  [<c112deae>] ? exact_lock+0x0/0xd
kernel: [231644.155861]  [<fc5a8b80>] ? slot_enable+0x405/0x4a5 [ndas_block]
kernel: [231644.155868]  [<fc5a8c63>] ? ndcmd_enabled_handler+0x43/0x9e [ndas_block]
kernel: [231644.155874]  [<fc5a8c20>] ? ndcmd_enabled_handler+0x0/0x9e [ndas_block]
kernel: [231644.155891]  [<fc54b22b>] ? notify_func+0x38/0x4b [ndas_core]
kernel: [231644.155906]  [<fc561cba>] ? _dpc_cancel+0x17c/0x626 [ndas_core]
kernel: [231644.155919]  [<fc562005>] ? _dpc_cancel+0x4c7/0x626 [ndas_core]
kernel: [231644.155933]  [<fc561cba>] ? _dpc_cancel+0x17c/0x626 [ndas_core]
kernel: [231644.155941]  [<c1003d47>] ? kernel_thread_helper+0x7/0x10

/* Here is the driver's debug output. It shows that this operation is being
   performed on the first device's request queue. */

kernel: [231644.155948] BL|8|nblk_request_proc|/driver/block/block26.c:494|processing queue request from slot 1|nd/dpcd1,64:15:44.38,3408:10
kernel: [231644.155959] BL|8|nblk_handle_io|/driver/block/block26.c:374|struct ndas_slot sd = NDAS GET SLOT DEV(slot 1)
kernel: [231644.155966] |nd/dpcd1,64:15:44.38,3328:10
kernel: [231644.155970] BL|8|nblk_handle_io|/driver/block/block26.c:458|case READA call ndas_read(slot=1, ndas_req)|nd/dpcd1,64:15:44.38,3328:10
kernel: [231644.155979] ND|8|ndas_read|netdisk/nddev.c:824|read io: slot=1, cmd=0, req=x00|nd/dpcd1,64:15:44.38,3320:10
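
As a cross-check on the log, the dev_t=0x3c00010 printed by ndop_open can be decoded by hand. The sketch below is plain user-space C, not driver code. It assumes the kernel-internal dev_t layout (20 minor bits, as defined by MINORBITS in include/linux/kdev_t.h); the demo_* names are made up for this illustration, and the 16 minors per disk is taken from minors=0x10 in the gendisk dump.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: kernel-internal dev_t keeps the minor in the low 20 bits
   and the major above it (see MINORBITS in include/linux/kdev_t.h).
   All demo_* names are hypothetical helpers for this example only. */
#define DEMO_MINORBITS 20u
#define DEMO_MINORMASK ((1u << DEMO_MINORBITS) - 1u)

/* Major number: everything above the minor bits. */
unsigned int demo_major(uint32_t dev)
{
    return dev >> DEMO_MINORBITS;
}

/* Minor number: the low 20 bits. */
unsigned int demo_minor(uint32_t dev)
{
    return dev & DEMO_MINORMASK;
}

/* Zero-based index of the whole disk that a minor falls into,
   given how many minors the driver reserves per disk. */
unsigned int demo_disk_index(unsigned int minor, unsigned int minors_per_disk)
{
    assert(minors_per_disk != 0);
    return minor / minors_per_disk;
}
```

Feeding in 0x3c00010 yields major 60 and minor 16, which agrees with major=60 and first_minor=16 in the gendisk dump. With 16 minors per disk that is disk index 1, so the open itself targeted the second disk; the mismatch only shows up later, in the request function.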

I hope this is enough background information. Maybe an obvious question at this moment is "When and where are the request_queues assigned?"

Well, that is handled a little bit before the add_disk function. The "adding disk" message is the first line of the log output.

slot->disk = NULL;
spin_lock_init(&slot->lock);
slot->queue = blk_init_queue(
    nblk_request_proc,
    &slot->lock
);

As far as I know, this is the standard operation. So, back to my original question: can I find the request queue somewhere and make sure it is incremented or unique for each new device, or does the Linux kernel only use one queue for each major number? I want to discover why this driver is loading the same queue on two different block devices, and determine whether that is causing the duplicate blkid during the initial registration process.
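
One way to narrow this down is to compare, inside the request function, the queue that was passed in against the queue belonging to the disk each request targets. The following is only a user-space model of that check, not code from the driver. The three struct definitions are cut-down stand-ins that borrow the field names queuedata, rq_disk, queue and private_data from the 2.6-era kernel structures; struct demo_slot and every demo_* function are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Cut-down stand-ins for the kernel structures. Only the fields used in
   this example are modelled; the real structures are far larger. */
struct request_queue { void *queuedata; };
struct gendisk { struct request_queue *queue; void *private_data; };
struct request { struct gendisk *rq_disk; };

/* Hypothetical per-device state, loosely modelled on the driver's slot. */
struct demo_slot {
    int number;
    struct request_queue queue;
    struct gendisk disk;
};

/* Wire up a slot so that its queue and its disk both point back at it. */
void demo_slot_init(struct demo_slot *s, int number)
{
    assert(s != NULL);
    s->number = number;
    s->queue.queuedata = s;
    s->disk.queue = &s->queue;
    s->disk.private_data = s;
}

/* Slot that owns the queue the request function was called with. */
int demo_slot_of_queue(const struct request_queue *q)
{
    return ((const struct demo_slot *)q->queuedata)->number;
}

/* Slot recorded on the disk that the request targets. */
int demo_slot_of_request(const struct request *req)
{
    return ((const struct demo_slot *)req->rq_disk->private_data)->number;
}

/* Returns 1 when the request sits on its own disk's queue, 0 otherwise. */
int demo_request_matches_queue(const struct request_queue *q,
                               const struct request *req)
{
    return req->rq_disk->queue == q;
}
```

In a real driver the equivalent check would be a debug print of the queue pointer and of the request's disk queue pointer next to the existing slot message. If the two pointers differ, or two slots report the same queue pointer, the queues are being mixed up; if they always agree, the slot lookup is the more likely culprit.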

Thanks for looking at this situation for me.

asked Jul 22 '11 04:07 by ndasusers





1 Answer

Queue = blk_init_queue(sbd_request, &Device.lock); 
answered Oct 14 '22 15:10 by dibin_salher
